mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-16 03:52:35 +01:00

Author	SHA1	Message	Date
Marko Mäkelä	ebefef658e	Merge 10.11 into 11.2	2024-10-18 11:32:22 +03:00
Marko Mäkelä	eca552a1a4	MDEV-34830: LSN in the future is not being treated as serious corruption The invariant of write-ahead logging is that before any change to a page is written to the data file, the corresponding log record must first have been durably written. In crash recovery, there were some sloppy checks for this. Let us implement accurate checks and flag an inconsistency as a hard error, so that we can avoid further corruption of a corrupted database. For data extraction from the corrupted database, innodb_force_recovery can be used. Before recovery is reading any data pages or invoking buf_dblwr_t::recover() to recover torn pages from the doublewrite buffer, InnoDB will have parsed the log until the final LSN and updated log_sys.lsn to that. So, we can rely on log_sys.lsn at all times. The doublewrite buffer recovery has been refactored in such a way that the recv_sys.dblwr.pages may be consulted while discovering files and their page sizes, but nothing will be written back to data files before buf_dblwr_t::recover() is invoked. recv_max_page_lsn, recv_lsn_checks_on: Remove. recv_sys_t::validate_checkpoint(): Validate the write-ahead-logging condition at the end of the recovery. recv_dblwr_t::validate_page(): Keep track of the maximum LSN (if we are checking a non-doublewrite copy of a page) but do not complain about the LSN being in the future. The doublewrite buffer is a special case, because it will be read early during recovery. Besides, starting with commit `762bcb81b5` the dblwr=true copies of pages may legitimately be "too new". recv_dblwr_t::find_page(): Find a valid page with the smallest FIL_PAGE_LSN that is in the valid range for recovery. recv_dblwr_t::restore_first_page(): Replaced by find_page(). Only buf_dblwr_t::recover() will write to data files. buf_dblwr_t::recover(): Simplify the message output. Do attempt doublewrite recovery on user page read error. Ignore doublewrite pages whose FIL_PAGE_LSN is outside the usable bounds. 
Previously, we could wrongly recover a too new page from the doublewrite buffer. It is unlikely that this could have led to an actual error. Write back all recovered pages from the doublewrite buffer here, including for the first page of any tablespace. buf_page_is_corrupted(): Distinguish the return values CORRUPTED_FUTURE_LSN and CORRUPTED_OTHER. buf_page_check_corrupt(): Return the error code DB_CORRUPTION in case the LSN is in the future. Datafile::read_first_page_flags(): Split from read_first_page(). Take a copy of the first page as a parameter. recv_sys_t::free_corrupted_page(): Take the file as a parameter and return whether a message was displayed. This avoids some duplicated and incomplete error messages. buf_page_t::read_complete(): Remove some redundant output and always display the name of the corrupted file. Never return DB_FAIL; use it only in internal error handling. IORequest::read_complete(): Assume that buf_page_t::read_complete() will have reported any error. fil_space_t::set_corrupted(): Return whether this is the first time the tablespace had been flagged as corrupted. Datafile::validate_first_page(), fil_node_open_file_low(), fil_node_open_file(), fil_space_t::read_page0(), fil_node_t::read_page0(): Add a parameter for a copy of the first page, and a parameter to indicate whether the FIL_PAGE_LSN check should be suppressed. Before buf_dblwr_t::recover() is invoked, we cannot validate the FIL_PAGE_LSN, but we can trust the FSP_SPACE_FLAGS and the tablespace ID that may be present in a potentially too new copy of a page. Reviewed by: Debarun Banerjee	2024-10-18 10:12:47 +03:00
Sergei Golubchik	3a1cf2c85b	MDEV-34679 ER_BAD_FIELD uses non-localizable substrings	2024-10-17 21:37:37 +02:00
Marko Mäkelä	bb47e575de	MDEV-34830: LSN in the future is not being treated as serious corruption The invariant of write-ahead logging is that before any change to a page is written to the data file, the corresponding log record must first have been durably written. On crash recovery, there were some sloppy checks for this. Let us implement accurate checks and flag an inconsistency as a hard error, so that we can avoid further corruption of a corrupted database. For data extraction from the corrupted database, innodb_force_recovery can be used. Before recovery is reading any data pages or invoking buf_dblwr_t::recover() to recover torn pages from the doublewrite buffer, InnoDB will have parsed the log until the final LSN and updated log_sys.lsn to that. So, we can rely on log_sys.lsn at all times. The doublewrite buffer recovery has been refactored in such a way that the recv_sys.dblwr.pages may be consulted while discovering files and their page sizes, but nothing will be written back to data files before buf_dblwr_t::recover() is invoked. A section of the test mariabackup.innodb_redo_overwrite that is parsing some mariadb-backup --backup output has been removed, because that output "redo log block is overwritten" would often be missing in a Microsoft Windows environment as a result of these changes. recv_max_page_lsn, recv_lsn_checks_on: Remove. recv_sys_t::validate_checkpoint(): Validate the write-ahead-logging condition at the end of the recovery. recv_dblwr_t::validate_page(): Keep track of the maximum LSN (if we are checking a non-doublewrite copy of a page) but do not complain about the LSN being in the future. The doublewrite buffer is a special case, because it will be read early during recovery. Besides, starting with commit `762bcb81b5` the dblwr=true copies of pages may legitimately be "too new". recv_dblwr_t::find_page(): Find a valid page with the smallest FIL_PAGE_LSN that is in the valid range for recovery. 
recv_dblwr_t::restore_first_page(): Replaced by find_page(). Only buf_dblwr_t::recover() will write to data files. buf_dblwr_t::recover(): Simplify the message output. Do attempt doublewrite recovery on user page read error. Ignore doublewrite pages whose FIL_PAGE_LSN is outside the usable bounds. Previously, we could wrongly recover a too new page from the doublewrite buffer. It is unlikely that this could have led to an actual error. Write back all recovered pages from the doublewrite buffer here, including for the first page of any tablespace. buf_page_is_corrupted(): Distinguish the return values CORRUPTED_FUTURE_LSN and CORRUPTED_OTHER. buf_page_check_corrupt(): Return the error code DB_CORRUPTION in case the LSN is in the future. Datafile::read_first_page(): Handle FSP_SPACE_FLAGS=0xffffffff in the same way on both 32-bit and 64-bit architectures. Datafile::read_first_page_flags(): Split from read_first_page(). Take a copy of the first page as a parameter. recv_sys_t::free_corrupted_page(): Take the file as a parameter and return whether a message was displayed. This avoids some duplicated and incomplete error messages. buf_page_t::read_complete(): Remove some redundant output and always display the name of the corrupted file. Never return DB_FAIL; use it only in internal error handling. IORequest::read_complete(): Assume that buf_page_t::read_complete() will have reported any error. fil_space_t::set_corrupted(): Return whether this is the first time the tablespace had been flagged as corrupted. Datafile::validate_first_page(), fil_node_open_file_low(), fil_node_open_file(), fil_space_t::read_page0(), fil_node_t::read_page0(): Add a parameter for a copy of the first page, and a parameter to indicate whether the FIL_PAGE_LSN check should be suppressed. Before buf_dblwr_t::recover() is invoked, we cannot validate the FIL_PAGE_LSN, but we can trust the FSP_SPACE_FLAGS and the tablespace ID that may be present in a potentially too new copy of a page. 
Reviewed by: Debarun Banerjee	2024-10-17 17:24:20 +03:00
Marko Mäkelä	740519e15a	MDEV-35125: Unnecessary buf_pool.page_hash lookups dict_index_t::clear(), btr_drop_temporary_table(): Make use of the root page guess if it is available. btr_read_autoinc(): Invoke btr_root_block_get() to access the root page. btr_blob_free(): Retain a buffer-fix on the page across mtr_t::commit() in order to avoid a buf_pool.page_hash lookup. dict_load_table_one(): Remove a redundant check for page id. It was already validated in buf_page_t::read_complete(). trx_t::apply_log(): Make use of buf_pool.page_fix() to avoid some mtr_t related overhead. Reviewed by: Thirunarayanan Balathandayuthapani	2024-10-17 09:10:45 +03:00
Thirunarayanan Balathandayuthapani	4a1ded61a4	MDEV-34529 Shrink the system tablespace when system tablespace contains MDEV-30671 leaked undo pages - InnoDB fails to shrink the system tablespace when it contains the leaked undo log pages caused by MDEV-30671. - InnoDB does free the unused segment in the system tablespace before shrinking the tablespace. InnoDB fails to free the unused segment if an XA PREPARE transaction exists or if the previous shutdown was not with innodb_fast_shutdown=0. inode_info: Structure to store the inode page and offsets. fil_space_t::garbage_collect(): Frees the unused segment of the system tablespace. fsp_get_sys_used_segment(): Iterates through all default file segments and index segments present in the system tablespace. trx_sys_t::is_xa_exist(): Returns true if an XA transaction exists in the undo logs. fseg_inode_free(): Frees the extents and fragment pages for the given index node and ignores any error, similar to trx_purge_free_segment(). trx_sys_t::reset_page(): Retain the TRX_SYS_FSEG_HEADER value in the trx_sys page while resetting the page.	2024-10-16 21:34:24 +05:30
Monty	0403313bdb	Fixed connect to not call strlen() over and over again in a loop	2024-10-16 17:24:46 +03:00
Vladislav Vaintroub	c1fc59277a	MDEV-34929 page-compressed tables do not work on Windows Remove the workaround for MDEV-13941. It served for 5 years, and all affected pre-release 10.2 installations should have been fixed in the meantime. Apparently InnoDB is using the is_sparse parameter in os_file_set_size() inconsistently, and it now passes is_sparse=false during the first file extension. With the MDEV-13941 workaround in place, it would unsparse the file, which makes compression not work at all anymore.	2024-10-16 16:02:13 +02:00
Marko Mäkelä	a4d2cc931d	MDEV-35174 Possible hang in trx_undo_prev_version() In commit `b7b9f3ce82` (MDEV-34515) we accidentally made the InnoDB MVCC code acquire a shared purge_sys.latch twice. Recursive shared latch acquisition may cause a deadlock of InnoDB threads if another thread in between will start waiting for an exclusive latch. purge_sys_t::latch: In debug builds, use srw_lock_debug instead of srw_spin_lock, so that bugs like this will result in debug assertion failures. trx_undo_report_row_operation(): Pass the view_guard to trx_undo_prev_version() and the rest of the arguments in the same order, so that the work to permute argument registers is minimized.	2024-10-16 14:37:44 +03:00
Thirunarayanan Balathandayuthapani	6aaae4c03b	MDEV-35122 Incorrect NULL value handling for instantly dropped BLOB columns Problem: ======= - Redundant table fails to insert into the table after instant drop blob column. Instant drop column only marks the column as hidden, and a subsequent insert statement tries to insert a NULL value for the dropped BLOB column and returns the fixed length of the blob type as 65535. This led to a row size too large error. Fix: ==== For redundant table, if the non-fixed dropped column can be null then set the length of the field type as 0.	2024-10-15 12:04:37 +05:30
Yuchen Pei	cd5577ba4a	Merge branch '10.5' into 10.6	2024-10-15 16:00:44 +11:00
Yuchen Pei	77ed235d50	MDEV-26345 Spider GBH should execute original queries on the data node Stop skipping const items when selecting but skip them when storing their results to spider row to avoid storing in mismatching temporary table fields. Skip auxiliary fields in SELECTing, and do not store the (non-existing) results to the corresponding temporary table accordingly. When there are BOTH auxiliary fields AND const items in the auxiliary field items, do not use the spider GBH. This is a rare occasion if it happens at all and not worth the added complexity to cover it. Use the original item (item_ptr) in constructing GROUP BY and ORDER BY, which also means using item->name instead of field->field_name as aliases in constructing SELECT items. This fixes spurious regressions caused by the above changes in some tests using ORDER BY, such as mdev_24517.test. As a by-product, this also fixes MDEV-29546. Therefore we update mdev_29008.test to include the MDEV-29546 case.	2024-10-15 15:36:12 +11:00
Yuchen Pei	e6daff40e4	MDEV-32524 [fixup] Fixup of spider mem alloc enums missed in a previous merge The merge was 10.4.34->10.5.25	2024-10-15 14:30:40 +11:00
Nayuta Yanagisawa	6080e3af19	MDEV-26912 Spider: Remove dead code related to Oracle OCI Remove the dead-code, in Spider, which is related to the Spider's Oracle OCI support. The code has been disabled for a long time and it is unlikely that the code will be enabled.	2024-10-15 14:30:40 +11:00
Yuchen Pei	03a5c683f9	MDEV-27650 Spider: remove #ifdef SPIDER_HAS_GROUP_BY_HANDLER	2024-10-15 14:30:39 +11:00
Yuchen Pei	0a59aafc5f	MDEV-34659 Bound check in spider cast function query construction During spider query construction of certain cast functions, it locates the last occurrence of a keyword in the output of the Item::print() function and append from there to the constructed query so far. For example, consider the following query SELECT * FROM t2 ORDER BY CAST(c AS INET6); It constructs the following query and executes it at the data node (assuming the data node table is called t0). select cast(t0.`c` as inet6) ``,t0.`c` `c` from `test`.`t1` t0 order by `` When the construction has completed the initial part select cast(t0.`c` It then attempts to construct the " as inet6" part. To that end, it calls print() on the Item_typecast_fbt corresponding to the cast item, and obtains cast(`test`.`t2`.`c` as inet6) It then looks for " as ", and places cursor there for appending: cast(`test`.`t2`.`c` as inet6) ^ In this patch, if the search fails, i.e. there's no " as ...", we make sure that the cursor is not placed before the beginning of the string (out of bound). We also relax the search from " as char" to " as " in the case of CHAR_TYPECAST_FUNC, since there is more than one Item type with this func type. For example, "AS INET6" is an Item_typecast_fbt which has this func type.	2024-10-15 14:30:30 +11:00
Yuchen Pei	98a9c75ea3	MDEV-34659 Use evalp in CREATE SERVER's in init_spider.inc This was already fixed in higher versions.	2024-10-15 14:18:12 +11:00
Yuchen Pei	d3b84ff10d	MDEV-30067 Remove some overly enthusiastic asserts when deleting from a partitioned table When a DDL statement results in a local partition table with partitions not covering all values in the table, a failure is emitted. However, when the table in question is a spider table, the issue does not surface until some future statements (DELETE in the test examples in this commit) are executed. This is consistent with the design of spider which aims to minimise connections with the data node. The resulting error is legitimate and should not result in an assertion failure. Similarly, a partitioned spider table could have misplaced rows, so we remove the other assertion as well.	2024-10-15 14:18:10 +11:00
Yuchen Pei	8a52639ede	MDEV-34716 spider: some trivial cleanups and documentation - document tmp_share, which are temporary spider shares with only one link (no ha) - simplify spider_get_sys_tables_connect_info() where link_idx is always 0	2024-10-15 11:04:27 +11:00
Yuchen Pei	2345407b8c	MDEV-34716 Fix mysql.servers socket max length too short The limit of socket length on unix according to libc is 108, see sockaddr_un::sun_path, but in the table it is a string of max length 64, which results in truncation of socket and failure to connect by plugins using servers such as spider.	2024-10-15 10:50:22 +11:00
Yuchen Pei	84df8d7275	MDEV-34716 spider: some trivial cleanups and documentation - document tmp_share, which are temporary spider shares with only one link (no ha) - simplify spider_get_sys_tables_connect_info() where link_idx is always 0	2024-10-15 10:50:22 +11:00
Thirunarayanan Balathandayuthapani	5777d9f282	MDEV-35116 InnoDB fails to set error index for HA_ERR_NULL_IN_SPATIAL - InnoDB fails to set the index information or index number for the spatial index error HA_ERR_NULL_IN_SPATIAL. row_build_spatial_index_key(): Initialize the tmp_mbr array completely. check_if_supported_inplace_alter(): Fix the spelling mistake of alter	2024-10-14 14:28:24 +05:30
Marko Mäkelä	4e1e9ea6f3	MDEV-35124 Set innodb_snapshot_isolation=ON by default From the very beginning, the default InnoDB transaction isolation level REPEATABLE READ does not correspond to any well formed definition. The main issue is the lack of write/write conflict detection. To fix that and to make REPEATABLE READ correspond to Snapshot Isolation, `b8a6719889` introduced the Boolean session variable innodb_snapshot_isolation. It was disabled by default in order not to break any user applications. In a new major version of MariaDB Server, we had better enable this parameter by default.	2024-10-11 15:02:31 +03:00
Monty	875d8c909f	Removed end '.' from variable comment	2024-10-09 17:10:30 +03:00
Oleksandr Byelkin	1d0e94c55f	Merge branch '10.5' into 10.6	2024-10-09 08:38:48 +02:00
Thirunarayanan Balathandayuthapani	23820f1d79	MDEV-34392 Inplace algorithm violates the foreign key constraint - Fixing the compilation issue for compilers older than gcc-6 Reviewed-by: Marko Mäkelä <marko.makela@mariadb.com>	2024-10-09 10:14:29 +05:30
Thirunarayanan Balathandayuthapani	65418ca9ad	MDEV-34392 Inplace algorithm violates the foreign key constraint - Fix the compilation error in gcc-5	2024-10-08 16:43:57 +05:30
Aleksey Midenkov	4e4c7dd4f5	MDEV-28288 System versioning doesn't support correct work for engine=connect and doesn't always give any warnings/errors Disabled system versioning for connect due to unsupported microseconds (MDEV-15967).	2024-10-08 13:08:10 +03:00
Aleksey Midenkov	5b940bdcfc	MDEV-25060 Freeing overrun buffer, various crashes, ASAN heap-buffer-overflow in _mi_put_key_in_record Rec buffer size depends on vreclength like this: length= MY_MAX(length, info->s->vreclength); The problem is rec buffer is allocated before vreclength is calculated. The fix reallocates rec buffer if vreclength changed. 1. Rec buffer allocated f0 mi_alloc_rec_buff (...) at ../src/storage/myisam/mi_open.c:738 f1 0x00005f4928244516 in mi_open (...) at ../src/storage/myisam/mi_open.c:671 f2 0x00005f4928210b98 in ha_myisam::open (...) at ../src/storage/myisam/ha_myisam.cc:847 f3 0x00005f49273aba41 in handler::ha_open (...) at ../src/sql/handler.cc:3105 f4 0x00005f4927995a65 in open_table_from_share (...) at ../src/sql/table.cc:4320 f5 0x00005f492769f084 in open_table (...) at ../src/sql/sql_base.cc:2024 f6 0x00005f49276a3ea9 in open_and_process_table (...) at ../src/sql/sql_base.cc:3819 f7 0x00005f49276a29b8 in open_tables (...) at ../src/sql/sql_base.cc:4303 f8 0x00005f49276a6f3f in open_and_lock_tables (...) at ../src/sql/sql_base.cc:5250 f9 0x00005f49275162de in open_and_lock_tables (...) at ../src/sql/sql_base.h:509 f10 0x00005f4927a30d7a in open_only_one_table (...) at ../src/sql/sql_admin.cc:412 f11 0x00005f4927a2c0c2 in mysql_admin_table (...) at ../src/sql/sql_admin.cc:603 f12 0x00005f4927a2fda8 in Sql_cmd_optimize_table::execute (...) at ../src/sql/sql_admin.cc:1517 f13 0x00005f49278102e3 in mysql_execute_command (...) at ../src/sql/sql_parse.cc:6180 f14 0x00005f49278012d7 in mysql_parse (...) at ../src/sql/sql_parse.cc:8236 2. vreclength calculated f0 ha_myisam::setup_vcols_for_repair (...) at ../src/storage/myisam/ha_myisam.cc:1002 f1 0x00005f49282138b4 in ha_myisam::optimize (...) at ../src/storage/myisam/ha_myisam.cc:1250 f2 0x00005f49273b4961 in handler::ha_optimize (...) at ../src/sql/handler.cc:4896 f3 0x00005f4927a2d254 in mysql_admin_table (...) 
at ../src/sql/sql_admin.cc:875 f4 0x00005f4927a2fda8 in Sql_cmd_optimize_table::execute (...) at ../src/sql/sql_admin.cc:1517 f5 0x00005f49278102e3 in mysql_execute_command (...) at ../src/sql/sql_parse.cc:6180 f6 0x00005f49278012d7 in mysql_parse (...) at ../src/sql/sql_parse.cc:8236 FYI backtrace was done with set print frame-info location set print frame-arguments presence set width 80	2024-10-08 13:08:10 +03:00
Marko Mäkelä	8a6a4c947a	Cleanup: Replace some is_predefined_tablespace() In some places, there were redundant comparisons against TRX_SYS_SPACE or SRV_TMP_SPACE_ID. The temporary tablespace is never the subject of log-based recovery. Also, consistently check for SRV_SPACE_ID_UPPER_BOUND. Reviewed by: Debarun Banerjee	2024-10-04 13:41:12 +03:00
Marko Mäkelä	b249a059da	MDEV-34850: Busy work while parsing FILE_ records In mariadb-backup --backup, we only have to invoke the undo_space_trunc and log_file_op callbacks as well as validate the mini-transaction checksums. There is absolutely no need to access recv_sys.pages or recv_spaces, or to allocate a decrypt_buf in case of innodb_encrypt_log=ON. This is what the new mode recv_sys_t::store::BACKUP will do. In the skip_the_rest: loop, the main thing is to process all FILE_ records until the end of the log is reached. Additionally, we must process INIT_PAGE and FREE_PAGE records in the same way as they would be during storing == YES. This was measured to reduce the CPU time between the messages "InnoDB: Multi-batch recovery needed at LSN" and "InnoDB: End of log at LSN" by some 20%. recv_sys_t::store: A ternary enumeration that specifies how records should be stored: NO, BACKUP, or YES. recv_sys_t::parse(), recv_sys_t::parse_mtr(), recv_sys_t::parse_pmem(): Replace template<bool store> with template<store storing>. store_freed_or_init_rec(): Simplify some logic. We can look up also the system tablespace. Reviewed by: Debarun Banerjee	2024-10-04 13:38:21 +03:00
Yuchen Pei	690f8a91f9	MDEV-35073 Fix -Wmaybe-uninitialized in spider_conn_first_link_idx Default result to "no eligible servers".	2024-10-04 17:18:20 +10:00
Marko Mäkelä	f493e46494	Merge 11.6 into 11.7	2024-10-03 18:15:13 +03:00
Marko Mäkelä	43465352b9	Merge 11.4 into 11.6	2024-10-03 16:09:56 +03:00
Marko Mäkelä	b53b81e937	Merge 11.2 into 11.4	2024-10-03 14:32:14 +03:00
Marko Mäkelä	12a91b57e2	Merge 10.11 into 11.2	2024-10-03 13:24:43 +03:00
Marko Mäkelä	63913ce5af	Merge 10.6 into 10.11	2024-10-03 10:55:08 +03:00
Marko Mäkelä	7e0afb1c73	Merge 10.5 into 10.6	2024-10-03 09:31:39 +03:00
Yuchen Pei	ba7088d462	Merge '11.4' into 11.6	2024-10-03 15:59:20 +10:00
Marko Mäkelä	23debf214f	MDEV-28091 fixup: Fix another pfs_malloc() stub In commit `0f56e21efa` only one pfs_malloc() stub was fixed to return aligned memory. Also, the MSVC _aligned_malloc() pairs with _aligned_free().	2024-10-03 08:15:17 +03:00
Daniel Black	1f7898f686	mroonga: remove -Wunused-but-set-variable warnings There were unused variables. They were not conditional on defines, so removed them. Added error handling in proc_object if there was no db, as subsequent operations would have failed.	2024-10-03 15:05:09 +10:00
Daniel Black	3723fd1573	MDEV-35007 mroonga should not modify source files during build CMake rewriting the tests causes Mroonga to be un-buildable on build environments where the source directory is read only. In the test results, the version wasn't particularly important. Remove the version dependence of tests.	2024-10-03 15:05:09 +10:00
Marko Mäkelä	cc70ca7eab	MDEV-35059 ALTER TABLE...IMPORT TABLESPACE with FULLTEXT SEARCH may corrupt the adaptive hash index build_fts_hidden_table(): Correct a mistake that had been made in commit `903ae30069` (MDEV-30655).	2024-10-02 11:09:31 +03:00
Sergei Golubchik	b1bbdbab9e	cleanup: remove redundant if() likely, a result of auto-merge of two fixes in different versions	2024-10-01 18:29:11 +02:00
Sergei Golubchik	813e592763	compilation failure in CONNECT storage/connect/tabfmt.cpp:419:24: error: '%.3d' directive writing between 3 and 10 bytes into a region of size 5 [-Werror=format-overflow=] 419 | sprintf(buf, "COL%.3d", i+1);	2024-10-01 18:29:11 +02:00
Marko Mäkelä	464055fe65	MDEV-34078 Memory leak in InnoDB purge with 32-column PRIMARY KEY row_purge_reset_trx_id(): Reserve large enough offsets for accommodating the maximum width PRIMARY KEY followed by DB_TRX_ID,DB_ROLL_PTR. Reviewed by: Thirunarayanan Balathandayuthapani	2024-10-01 18:35:39 +03:00
Marko Mäkelä	a298dfb84c	MDEV-35053 Crash in purge_sys_t::iterator::free_history_rseg() purge_sys_t::get_page(): Avoid accessing a freed reference to pages[id] after pages.erase(id). This heap-use-after-free would sometimes be caught by AddressSanitizer. purge_sys_t::iterator::free_history_rseg(): Do not crash if undo=nullptr (the database is corrupted). Reviewed by: Debarun Banerjee	2024-10-01 15:03:04 +03:00
Marko Mäkelä	2d031f4a71	MDEV-34973 fixup for POWER,s390x xtest(): Correct the declaration.	2024-10-01 13:29:59 +03:00
Max Kellermann	6715e4dfe1	MDEV-34973: innobase/dict0dict: add `noexcept` to lock/unlock methods Another chance for cutting back overhead due to C++ exceptions being enabled; the `dict_sys_t` class is a good candidate because its locking methods are called frequently. Binary size reduction this time: text data bss dec hex filename 24448622 2436488 9473537 36358647 22ac9f7 build/release/sql/mariadbd 24448474 2436488 9473601 36358563 22ac9a3 build/release/sql/mariadbd	2024-10-01 09:53:16 +03:00
Max Kellermann	813123e3e0	MDEV-34973: innobase/lock0lock: add `noexcept` MariaDB is compiled with C++ exceptions enabled, and that disallows some optimizations (e.g. the stack must always be unwinding-safe). By adding `noexcept` to functions that are guaranteed to never throw, some of these optimizations can be regained. Low-level locking functions that are called often are a good candidate for this. This shrinks the executable a bit (tested with GCC 14 on aarch64): text data bss dec hex filename 24448910 2436488 9473185 36358583 22ac9b7 build/release/sql/mariadbd 24448622 2436488 9473537 36358647 22ac9f7 build/release/sql/mariadbd	2024-10-01 09:53:16 +03:00
Thirunarayanan Balathandayuthapani	cc810e64d4	MDEV-34392 Inplace algorithm violates the foreign key constraint Don't allow the referencing key column to change from NULL to NOT NULL when 1) Foreign key constraint type is ON UPDATE SET NULL 2) Foreign key constraint type is ON DELETE SET NULL 3) Foreign key constraint type is UPDATE CASCADE and referenced column declared as NULL Don't allow the referenced key column to change from NOT NULL to NULL when foreign key constraint type is UPDATE CASCADE and referencing key columns don't allow NULL values get_foreign_key_info(): InnoDB sends the information about nullability of the foreign key fields and referenced key fields. fk_check_column_changes(): Enforce the above rules for COPY algorithm innobase_check_foreign_drop_col(): Checks whether the dropped column exists in existing foreign key relation innobase_check_foreign_low(): Enforce the above rules for INPLACE algorithm dict_foreign_t::check_fk_constraint_valid(): This is used by CREATE TABLE statement to check nullability for foreign key relation.	2024-10-01 09:41:56 +05:30
Max Kellermann	45298b730b	sql/handler: referenced_by_foreign_key() returns bool The method was declared to return an unsigned integer, but it is really a boolean (and used as such by all callers). A secondary change is the addition of "const" and "noexcept" to this method. In ha_mroonga.cpp, I also added "inline" to the two helper methods of referenced_by_foreign_key(). This allows the compiler to flatten the method.	2024-09-30 16:33:25 +03:00
Marko Mäkelä	d28ac3f82d	MDEV-34207: ALTER TABLE...STATS_PERSISTENT=0 fails to drop statistics commit_try_norebuild(): If the STATS_PERSISTENT attribute of the table is being changed to disabled, drop the persistent statistics of the table.	2024-09-30 15:27:38 +03:00
Sergei Golubchik	b88f1267e4	MDEV-33373 part 2: Unexpected ER_FILE_NOT_FOUND upon reading from logging table after crash recovery The CSV engine should set my_errno if it uses it.	2024-09-30 13:50:51 +02:00
Marko Mäkelä	dd5ce6b0c4	MDEV-34450 os_file_write_func() is an overkill for ib_logfile0 log_file_t::read(), log_file_t::write(): Invoke pread() or pwrite() directly, so that we can give more accurate diagnostics in case of a failure, and so that we will avoid the overhead of setting up 5(!) stack frames and related objects. tpool::pwrite(): Add a missing const qualifier.	2024-09-30 13:36:38 +03:00
Daniel Black	f199dffe3b	MDEV-34937 s3 engine no longer functional on non-gcc builds Last commit on libmarias3 broke non-gcc builds by excluding the most important aspect, the snprintf being executed. Reviewer: Andrew Hutchings <andrew@mariadb.org> Ref: https://github.com/mariadb-corporation/libmarias3/pull/130	2024-09-30 09:23:30 +01:00
Yuchen Pei	282b92f0a2	MDEV-34589 Do not execute before queries in spider_db_mbase::rollback() Rollback is not supposed to fail. This prevents false failures in spider rollback.	2024-09-30 16:16:27 +10:00
Yuchen Pei	42735c557e	MDEV-34636 Spider: reset wide_handler->trx in two occasions ha_spider::update_create_info() ha_spider::append_lock_tables_list()	2024-09-30 15:52:08 +10:00
Yuchen Pei	f43ea935a1	MDEV-34636 Remove implementation of ha_spider::extra() with MERGE flags	2024-09-30 15:51:18 +10:00
Yuchen Pei	69874ee95c	MDEV-34828 Remove some obsolete cmake code related to the removed spider handlersocket support A fixup of MDEV-26858	2024-09-30 15:12:00 +10:00
Kristian Nielsen	35c732cdde	MDEV-25611: RESET MASTER causes the server to hang RESET MASTER waits for storage engines to reply to binlog checkpoint requests. If this response is delayed for a long time for some reason, then RESET MASTER can hang. Fix this by forcing a log sync in all engines just before waiting for the checkpoint reply. (Waiting for old checkpoint responses is needed to preserve durability of any commits that were synced to disk in the to-be-deleted binlog but not yet synced in the engine.) Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com> Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>	2024-09-27 15:10:06 +02:00
Marko Mäkelä	2d3ddaef35	MDEV-34907 Bogus assertion failure and busy work while parsing FILE_ records A server that was running with innodb_log_file_size=96M and innodb_buffer_pool_size=6M had inserted some data into a table that was subsequently dropped. When the server was killed and restarted, an assertion failed in recv_sys_t::parse() while a FSP_SIZE change was unnecessarily being processed during the skip_the_rest: loop in recv_scan_log(). The ib_logfile0 contents was as follows: 1. The checkpoint start LSN points to the start of some mini-transaction. 2. There may be log records for modifying files for which a FILE_MODIFY had been written before the checkpoint. These records were "purged" by advancing the checkpoint. 3. At some point during the initial parsing with store=true the space reserved for recv_sys.pages will run out and recv_scan_log() would switch to the skip_the_rest: mode. 4. We encounter a log record for extending a tablespace that will be deleted a bit later. This would trip the bogus debug assertion. 5. Later on, there would be a FILE_DELETE record for this tablespace. 6. The checkpoint end LSN points to a possibly empty sequence of FILE_MODIFY records and a FILE_CHECKPOINT record. Recovery had parsed these records first, before rewinding to the checkpoint start LSN. 7. There could be further records following the FILE_CHECKPOINT record. Recovery will process all records until an inconsistency is found and it is assumed that the end of the circular ib_logfile0 was reached. recv_sys_t::parse(): For the template instantiation with store=false, remove a debug assertion that could fail in a multi-batch recovery, while recv_scan_log(false) would be in the skip_the_rest: loop. It is very well possible that we have not encountered all FILE_ records yet, and therefore we should not complain about unknown tablespaces. Reviewed by: Debarun Banerjee	2024-09-27 12:31:37 +03:00
Marko Mäkelä	6acada713a	MDEV-34062: Implement innodb_log_file_mmap on 64-bit systems When using the default innodb_log_buffer_size=2m, mariadb-backup --backup would spend a lot of time re-reading and re-parsing the log. For reads, it would be beneficial to memory-map the entire ib_logfile0 to the address space (typically 48 bits or 256 TiB) and read it from there, both during --backup and --prepare. We will introduce the Boolean read-only parameter innodb_log_file_mmap that will be OFF by default on most platforms, to avoid aggressive read-ahead of the entire ib_logfile0 when only a tiny portion would be accessed. On Linux and FreeBSD the default is innodb_log_file_mmap=ON, because those platforms define a specific mmap(2) option for enabling such read-ahead and therefore it can be assumed that the default would be on-demand paging. This parameter will only have impact on the initial InnoDB startup and recovery. Any writes to the log will use regular I/O, except when the ib_logfile0 is stored in a specially configured file system that is backed by persistent memory (Linux "mount -o dax"). We also experimented with allowing writes of the ib_logfile0 via a memory mapping and decided against it. A fundamental problem would be unnecessary read-before-write in case of a major page fault, that is, when a new, not yet cached, virtual memory page in the circular ib_logfile0 is being written to. There appears to be no way to tell the operating system that we do not care about the previous contents of the page, or that the page fault handler should just zero it out. Many references to HAVE_PMEM have been replaced with references to HAVE_INNODB_MMAP. The predicate log_sys.is_pmem() has been replaced with log_sys.is_mmap() && !log_sys.is_opened(). Memory-mapped regular files differ from MAP_SYNC (PMEM) mappings in the way that an open file handle to ib_logfile0 will be retained. In both code paths, log_sys.is_mmap() will hold.
Holding a file handle open will allow log_t::clear_mmap() to disable the interface with fewer operations. It should be noted that ever since commit `685d958e38` (MDEV-14425) most 64-bit Linux platforms on our CI platforms (s390x a.k.a. IBM System Z being a notable exception) read and write /dev/shm/*/ib_logfile0 via a memory mapping, pretending that it is persistent memory (mount -o dax). So, the memory mapping based log parsing that this change is enabling by default on Linux and FreeBSD has already been extensively tested on Linux. ::log_mmap(): If a log cannot be opened as PMEM and the desired access is read-only, try to open a read-only memory mapping. xtrabackup_copy_mmap_snippet(), xtrabackup_copy_mmap_logfile(): Copy the InnoDB log in mariadb-backup --backup from a memory mapped file.	2024-09-26 18:47:12 +03:00
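The read-only mapping described in the MDEV-34062 entry above can be illustrated with plain POSIX calls. This is a minimal sketch, not the InnoDB code: the names toy_mapping, toy_map_readonly() and toy_unmap() are hypothetical, no platform-specific read-ahead option is passed to mmap(2), and the file handle is kept open only to mirror what the commit message says about memory-mapped regular files.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper type: a read-only view of a whole file.
struct toy_mapping
{
  const unsigned char *data; // nullptr if the mapping failed
  size_t size;               // size of the mapped file in bytes
  int fd;                    // file handle that is kept open, or -1
};

// Map an entire regular file read-only. On failure, data is nullptr.
toy_mapping toy_map_readonly(const char *path)
{
  toy_mapping m{nullptr, 0, -1};
  int fd= open(path, O_RDONLY);
  if (fd < 0)
    return m;
  struct stat st;
  if (fstat(fd, &st) == 0 && st.st_size > 0)
  {
    void *p= mmap(nullptr, size_t(st.st_size), PROT_READ, MAP_SHARED, fd, 0);
    if (p != MAP_FAILED)
    {
      m.data= static_cast<const unsigned char*>(p);
      m.size= size_t(st.st_size);
      m.fd= fd; // retained, so that the mapping can later be replaced by regular I/O
      return m;
    }
  }
  close(fd);
  return m;
}

// Release the mapping and the retained file handle.
void toy_unmap(toy_mapping &m)
{
  if (m.data)
    munmap(const_cast<unsigned char*>(m.data), m.size);
  if (m.fd >= 0)
    close(m.fd);
  m= toy_mapping{nullptr, 0, -1};
}
```

A reader of the log would then parse bytes directly from data[0] to data[size - 1] instead of copying them into a buffer first, which is the saving that the commit message attributes to mariadb-backup.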
Denis Protivensky	231900e5bb	MDEV-34836: TOI on parent table must BF abort SR in progress on a child Applied SR transaction on the child table was not BF aborted by TOI running on the parent table for several reasons: Although SR correctly collected FK-referenced keys to parent, TOI in Galera disregards common certification index and simply sets itself to depend on the latest certified write set seqno. Since this write set was the fragment of SR transaction, TOI was allowed to run in parallel with SR presuming it would BF abort the latter. At the same time, DML transactions in the server don't grab MDL locks on FK-referenced tables, thus parent table wasn't protected by an MDL lock from SR and it couldn't provoke MDL lock conflict for TOI to BF abort SR transaction. In InnoDB, DDL transactions grab shared MDL locks on child tables, which is not enough to trigger MDL conflict in Galera. InnoDB-level Wsrep patch didn't contain correct conflict resolution logic due to the fact that it was believed MDL locking should always produce conflicts correctly. The fix brings conflict resolution rules similar to MDL-level checks to InnoDB, thus accounting for the problematic case. Apart from that, wsrep_thd_is_SR() is patched to return true only for executing SR transactions. It should be safe as any other SR state is either the same as for any single write set (thus making the two logically equivalent), or it reflects an SR transaction as being aborting or prepared, which is handled separately in BF-aborting logic, and for regular execution path it should not matter at all. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-09-24 11:14:01 +02:00
Marko Mäkelä	971cf59579	Merge 10.6 into 10.11	2024-09-24 08:49:20 +03:00
Marko Mäkelä	638c62acac	MDEV-34983: Remove x86 asm from InnoDB Starting with GCC 7 and clang 15, single-bit operations such as fetch_or(1) & 1 are translated into 80386 instructions such as LOCK BTS, instead of using the generic translation pattern of emitting a loop around LOCK CMPXCHG. Given that the oldest currently supported GNU/Linux distributions ship GCC 7, and that older versions of GCC are out of support, let us remove some work-arounds that are not strictly necessary. If someone compiles the code using an older compiler, it will work but possibly less efficiently. srw_mutex_impl::HOLDER: Changed from 1U<<31 to 1 in order to work around https://github.com/llvm/llvm-project/issues/37322 which is specific to setting the most significant bit. srw_mutex_impl::WAITER: A multiplier of waiting requests. This used to be 1, which would now collide with HOLDER. fil_space_t::set_stopping(): Remove this unused function. In MSVC we need _interlockedbittestandset() for LOCK BTS.	2024-09-23 12:51:27 +03:00
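To illustrate the MDEV-34983 entry above: a lock word can keep a holder flag in the least significant bit and count waiting requests in multiples of a second constant, so that the two never collide. The sketch below is a simplified, hypothetical toy_lock_word and not the actual srw_mutex_impl; HOLDER=1 comes from the commit message, while WAITER=2 is an assumption chosen as the smallest multiplier that leaves the HOLDER bit free.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical, simplified lock word for illustration only;
// this is not the actual srw_mutex_impl.
class toy_lock_word
{
  std::atomic<uint32_t> word{0};
public:
  // The least significant bit tells whether the lock is held.
  static constexpr uint32_t HOLDER= 1;
  // Each waiting request adds this much (assumed value).
  static constexpr uint32_t WAITER= 2;

  // Single-bit test-and-set. Per the commit message, recent compilers
  // can translate this pattern into LOCK BTS on x86.
  bool try_acquire() noexcept
  { return !(word.fetch_or(HOLDER, std::memory_order_acquire) & HOLDER); }

  // Clear the HOLDER bit without disturbing the waiter count.
  void release() noexcept
  { word.fetch_and(~HOLDER, std::memory_order_release); }

  void add_waiter() noexcept
  { word.fetch_add(WAITER, std::memory_order_relaxed); }

  void remove_waiter() noexcept
  { word.fetch_sub(WAITER, std::memory_order_relaxed); }

  bool is_locked() const noexcept
  { return word.load(std::memory_order_relaxed) & HOLDER; }

  uint32_t waiters() const noexcept
  { return word.load(std::memory_order_relaxed) / WAITER; }
};
```

With this layout, acquiring or releasing the lock touches only bit 0, and registering a waiter never changes it, which is why a WAITER value of 1 would no longer work once HOLDER is moved to the least significant bit.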
Daniel Black	ac5cbaff66	Aria - correct type Aria transaction ids are uint16 rather than uint. Change the type to be more accurate.	2024-09-21 09:11:02 +10:00
mariadb-DebarunBanerjee	35d477dd1d	MDEV-34453 Trying to read 16384 bytes at 70368744161280 outside the bounds of the file: ./ibdata1 The issue is caused by a race between buf_page_create_low getting the page from the buffer pool hash and buf_LRU_free_page evicting it from the LRU. The issue was introduced in 10.6 by commit `aaef2e1d8c` (MDEV-27058: Reduce the size of buf_block_t and buf_page_t). The solution is to buffer-fix the page before releasing the buffer pool mutex in buf_page_create_low when x_lock_try fails to acquire the page latch.	2024-09-20 20:26:43 +05:30
Marko Mäkelä	9ea7f7129a	MDEV-34909 DDL hang during SET GLOBAL innodb_log_file_size on PMEM log_t::persist(): Add a parameter holding_latch to specify whether the caller is already holding exclusive log_sys.latch, like log_write_and_flush() always is.	2024-09-20 15:29:56 +03:00
Lena Startseva	0a5e4a0191	MDEV-31005: Make working cursor-protocol Updated tests: cases with bugs or which cannot be run with the cursor-protocol were excluded with "--disable_cursor_protocol"/"--enable_cursor_protocol" Fix for v.10.5	2024-09-18 18:39:26 +07:00
Marko Mäkelä	7ea9e1358f	Merge 11.2 into 11.4	2024-09-18 08:07:22 +03:00
Marko Mäkelä	e782e416ac	Merge 10.11 into 11.2	2024-09-18 07:38:49 +03:00
Marko Mäkelä	64b75865d5	MDEV-34823 after-merge fix btr_cur_t::search_leaf(): Remove a redundant condition. This fixes up the merge commit `cfa9784edb`	2024-09-18 07:06:35 +03:00
Yuchen Pei	1f7f406b7b	Merge branch '11.2' into 11.4	2024-09-18 11:27:53 +10:00
Yuchen Pei	ff88633b9c	Merge branch '10.11' into 11.2	2024-09-18 10:45:26 +10:00
Yuchen Pei	cfa9784edb	Merge branch '10.11' into 11.2	2024-09-18 10:25:16 +10:00
Sergei Petrunia	5cd3fa81ef	Merge 10.11 -> 11.2	2024-09-17 12:34:33 +03:00
Julius Goryavsky	f176248d4b	Merge branch '10.6' into '10.11'	2024-09-17 06:23:10 +02:00
Julius Goryavsky	80fff4c6b1	Merge branch '10.5' into '10.6'	2024-09-16 16:39:59 +02:00
Marko Mäkelä	b187414764	Merge 10.6 into 10.11	2024-09-16 10:58:40 +03:00
Marko Mäkelä	4010dff058	mtr_t::log_file_op(): Fix -Wnonnull GCC 12.2.0 could issue -Wnonnull for an unreachable call to strlen(new_path). Let us prevent that by replacing the condition (type == FILE_RENAME) with the equivalent (new_path). This should also optimize the generated code, because the life time of the parameter "type" will be reduced.	2024-09-14 11:05:44 +03:00
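The -Wnonnull fix in the entry above relies on two conditions being equivalent: a new path is passed exactly when the operation is a rename. A minimal sketch of the idea follows; toy_file_op and toy_log_payload_size() are hypothetical names and the size computation is invented for illustration, it is not what mtr_t::log_file_op() computes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical operation codes for illustration.
enum toy_file_op { TOY_FILE_DELETE, TOY_FILE_RENAME };

// Compute how many bytes the path names would occupy, including the
// terminating NUL of each. new_path must be non-null exactly when the
// operation is a rename.
size_t toy_log_payload_size(toy_file_op type, const char *path,
                            const char *new_path)
{
  assert((type == TOY_FILE_RENAME) == (new_path != nullptr));
  size_t len= std::strlen(path) + 1;
  // Testing the pointer itself, rather than (type == TOY_FILE_RENAME),
  // lets the compiler see that strlen() is never reached with a null
  // argument.
  if (new_path)
    len+= std::strlen(new_path) + 1;
  return len;
}
```

Because the branch condition is the pointer being dereferenced, the compiler no longer has to prove a relationship between two separate parameters in order to rule out a null argument.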
Marko Mäkelä	e3f653ca66	MDEV-34750 fixup: -Wconversion on 32-bit log_t::resize_write_buf(): If d<0 and d>-length, d will fit in ssize_t, which is a signed 32-bit or 64-bit integer. Cast from int64_t to ssize_t to make this clear and to silence a compiler warning.	2024-09-14 10:35:28 +03:00
Marko Mäkelä	762ad01c7f	Merge 11.2 into 11.4	2024-09-13 13:09:23 +03:00
Marko Mäkelä	a74bea7ba9	MDEV-34879 InnoDB fails to merge the change buffer to ROW_FORMAT=COMPRESSED tables buf_page_t::read_complete(): Fix an incorrect condition that had been added in commit `aaef2e1d8c` (MDEV-27058). Also for compressed-only pages we must remember that buffered changes may exist. buf_read_page(): Correct the function comment; this is for a synchronous and not asynchronous read. Pass the parameter unzip=true to buf_read_page_low(), because each of our callers will be interested in the uncompressed page frame. This will cause the test encryption.innodb-compressed-blob to emit more errors when the correct keys for decrypting the clustered index root page are unavailable. Reviewed by: Debarun Banerjee	2024-09-12 10:52:55 +03:00
Marko Mäkelä	f168050e90	MDEV-34791 fixup: Avoid an infinite loop with ROW_FORMAT=COMPRESSED buf_pool_t::page_fix(): If a change buffer merge may be needed on a ROW_FORMAT=COMPRESSED page that exists in compressed-only format in the buffer pool, go ahead to decompress the block. This fixes an infinite loop. Reviewed by: Debarun Banerjee	2024-09-12 10:52:12 +03:00
Yuchen Pei	a8c5717223	Merge branch '10.6' into 10.11	2024-09-12 10:44:13 +10:00
Yuchen Pei	09b1269e4a	Merge branch '10.5' into 10.6	2024-09-12 10:17:51 +10:00
Monty	3ae4ecbfc5	MDEV-34867 engine S3 cause 500 error for huawei buckets Add support for removing the Content-Type header to the S3 engine. This is required for compatibility with some S3 providers. This also adds a provider option to the S3 engine which will turn on relevant compatibility options for specific providers. This was required for getting the MariaDB S3 engine to work with "Huawei Cloud S3". To get Huawei S3 storage to work, one has to set one of the following S3 options: s3_provider=Huawei s3_ssl_no_verify=1 Author: Andrew Hutchings <andrew@mariadb.org>	2024-09-11 16:15:37 +03:00
Yuchen Pei	b168859d1e	Merge branch '10.6' into 10.11	2024-09-11 16:10:53 +10:00
Yuchen Pei	4a09e74387	Merge branch '10.5' into 10.6	2024-09-11 15:49:16 +10:00
Marko Mäkelä	f0de610d0c	Merge 10.11 into 11.2	2024-09-10 18:35:16 +03:00
Yuchen Pei	fe3432b3bd	MDEV-28009 Deprecate spider_table_crd_thread_count and spider_table_sts_thread_count These variables/parameters have the default read-only value of 1, and the only way to change them is through a command line flag together with a command line flag loading spider. After this change, the flag will have no effect.	2024-09-10 14:48:59 +10:00
Yuchen Pei	cc0faa1e3e	MDEV-31788 Factor functions to reduce duplication around spider_check_and_init_casual_read in ha_spider.cc factored out static functions: - spider_prep_loop - spider_start_bg - spider_send_queries	2024-09-10 11:52:26 +10:00
Yuchen Pei	0ba97e4dc6	MDEV-31788 Factor out calls to spider_ping_table_mon_from_table in ha_spider.cc	2024-09-10 11:52:26 +10:00
Yuchen Pei	9e1579788f	MDEV-31788 Factor spider locking and unlocking code around sending queries	2024-09-10 11:52:22 +10:00
Yuchen Pei	84067291b4	MDEV-28360 Spider: remove #ifdef SPIDER_use_LEX_CSTRING_for_KEY_Field_name	2024-09-10 11:19:19 +10:00
Yuchen Pei	f5b7c25e1e	MDEV-27643 Spider: remove #ifdef HA_CAN_BULK_ACCESS	2024-09-10 11:19:19 +10:00
Yuchen Pei	e7570c7759	MDEV-31788 Remove spider_file_pos They are for unnecessary debugging purposes only.	2024-09-10 11:19:18 +10:00
Yuchen Pei	a81f419b06	MDEV-27648 remove #define HASH_UPDATE_WITH_HASH_VALUE The functions called in blocks protected by this macro remain undefined as of 11.5 `c96b23f994`	2024-09-10 11:19:14 +10:00
Yuchen Pei	5d54e86c22	MDEV-26178 spider: delete spd_environ.h It's virtually empty now	2024-09-10 11:15:18 +10:00
Yuchen Pei	869c501ac3	MDEV-27644 Spider: remove HANDLER_HAS_DIRECT_AGGREGATE	2024-09-10 11:15:18 +10:00
Yuchen Pei	3a58291680	MDEV-27662 remove SPIDER_SUPPORT_CREATE_OR_REPLACE_TABLE	2024-09-10 11:15:17 +10:00
Yuchen Pei	84977868b1	MDEV-27809 remove SPIDER_I_S_USE_SHOW_FOR_COLUMN Show::Column() was added in MDEV-19772 `4156b1a260`	2024-09-10 11:15:17 +10:00
Yuchen Pei	6287fb6e17	MDEV-27652 remove #ifdef HA_HAS_CHECKSUM_EXTENDED handler::pre_calculate_checksum was added in MDEV-16249 `be5c432a42`	2024-09-10 11:15:17 +10:00
Yuchen Pei	e8a5553cef	MDEV-27808 remove SPIDER_LIKE_FUNC_HAS_GET_NEGATED get_negated() was introduced in MDEV-16707	2024-09-10 11:15:16 +10:00
Yuchen Pei	ab49b46d01	MDEV-27664 remove SPIDER_SQL_CACHE_IS_IN_LEX sql_cache was moved to lex in MDEV-11953 in `de745ecf29`	2024-09-10 11:15:16 +10:00
Yuchen Pei	a1e5ee9111	MDEV-27663 remove SPIDER_USE_CONST_ITEM_FOR_STRING_INT_REAL_DECIMAL_DATE_ITEM {STRING\|INT\|REAL\|DECIMAL\|DATE}_ITEM were replaced with CONST_ITEM in MDEV-14630 `c20cd68e60`	2024-09-10 11:15:16 +10:00
Yuchen Pei	5e98471df1	MDEV-27811: remove SPIDER_MDEV_16246 MDEV-16246 was fixed long ago. And this macro was removed in other versions too	2024-09-10 11:15:15 +10:00
Yuchen Pei	8c8684b17f	MDEV-28226 Remove HANDLER_HAS_NEED_INFO_FOR_AUTO_INC handler::need_info_for_auto_inc() was added in MDEV-7720 / MDEV-7726 in commit `dc17ac1638`	2024-09-10 11:15:15 +10:00
Yuchen Pei	affcb0713d	MDEV-26178 spider: remove PARTITION_HAS_GET_PART_SPEC This macro is unused, and not in 11.5 `c96b23f994`	2024-09-10 11:15:15 +10:00
Yuchen Pei	6d0d09ebc2	MDEV-26178 Spider: remove HANDLER_HAS_TOP_TABLE_FIELDS This macro is unused	2024-09-10 11:15:14 +10:00
Yuchen Pei	1cb75d9a33	MDEV-27660 Remove #ifdef SPIDER_HANDLER_START_BULK_INSERT_HAS_FLAGS The flag argument was added to handler::start_bulk_insert() in the MDEV-539 commit `ca2cdaad86`	2024-09-10 11:15:14 +10:00
Yuchen Pei	aaba68ac1e	MDEV-28896 Spider: remove #ifdef SPIDER_UPDATE_ROW_HAS_CONST_NEW_DATA new_data is const since at least 2017: `a05a610d60`	2024-09-10 11:15:14 +10:00
Yuchen Pei	f16c037753	MDEV-28895 Spider: remove #ifdef HANDLER_HAS_CAN_USE_FOR_AUTO_INC_INIT handler has had can_use_for_auto_inc_init() since 2017 at the latest: `dc17ac1638`	2024-09-10 11:15:14 +10:00
Yuchen Pei	0650c87d9b	MDEV-27647 Spider: remove HANDLER_HAS_DIRECT_UPDATE_ROWS	2024-09-10 11:15:13 +10:00
Yuchen Pei	d5d65b948b	MDEV-26178 Spider: remove HA_EXTRA_HAS_HA_EXTRA_USE_CMP_REF HA_EXTRA_USE_CMP_REF is undefined, and remains so as of 11.5 `c96b23f994`	2024-09-10 11:15:13 +10:00
Yuchen Pei	de3dd942c0	MDEV-28894 Spider: remove #ifdef HA_EXTRA_HAS_STARTING_ORDERED_INDEX_SCAN HA_EXTRA_STARTING_ORDERED_INDEX_SCAN was added in 2018 at the latest: `921c5e9314`	2024-09-10 11:15:13 +10:00
Yuchen Pei	64581c83e8	MDEV-28893 Spider: remove #ifdef SPIDER_NET_HAS_THD net has thd since 2015 in `56aa19989f` for MDEV-6152	2024-09-10 11:15:12 +10:00
Yuchen Pei	ba9bebd719	MDEV-28892 remove #ifdef SPIDER_Item_args_arg_count_IS_PROTECTED arg_count was protected since 2015 in commit `afa1773439`	2024-09-10 11:15:12 +10:00
Yuchen Pei	05fafaf82d	MDEV-27646 remove SPIDER_HAS_HASH_VALUE_TYPE unifdef -DSPIDER_HAS_HASH_VALUE_TYPE -m storage/spider/spd_* storage/spider/ha_spider.* storage/spider/hs_client/*	2024-09-10 11:15:12 +10:00
Daniel Black	ccb4bc7754	MDEV-33894: Resurrect innodb_log_write_ahead_size (postfix) os_file_log_maybe_unbuffered is now Linux only. Also, the stat st structure is only used on Linux. This avoids unused function/structure errors on FreeBSD.	2024-09-10 09:13:49 +10:00
Sergei Petrunia	2c3b298337	Merge 11.2 into 11.4	2024-09-09 14:40:02 +03:00
Sergei Petrunia	abd98336d2	Merge 10.11 -> 11.2	2024-09-09 13:50:38 +03:00
Thirunarayanan Balathandayuthapani	06ad31d4b6	MDEV-34855 Bootstrap hangs while shrinking the system tablespace Reason: During bootstrap, InnoDB shrinks the system tablespace while purge is still active on the system tablespace. This could lead to a deadlock. Fix: Avoid shrinking the system tablespace during bootstrap. Move the shrinking logic of the system tablespace to after purge has been shut down.	2024-09-09 14:34:14 +05:30
Marko Mäkelä	b7b2d2bde4	Merge 10.5 into 10.6	2024-09-09 11:30:30 +03:00
Yuchen Pei	d002b1f503	Merge branch '10.6' into 10.11	2024-09-09 11:34:19 +10:00
Marko Mäkelä	f9f92b480e	Merge 10.6 into 10.11	2024-09-06 16:17:42 +03:00
Marko Mäkelä	f06060f5ed	Cleanup: Remove the function dict_remove_db_name()	2024-09-06 14:31:55 +03:00
Marko Mäkelä	024a18dbcb	MDEV-34823 Invalid arguments in ib_push_warning() In the bug report MDEV-32817 it occurred that the function row_mysql_get_table_status() is outputting a fil_space_t* as if it were a numeric tablespace identifier. ib_push_warning(): Remove. Let us invoke push_warning_printf() directly. innodb_decryption_failed(): Report a decryption failure and set the dict_table_t::file_unreadable flag. This code was being duplicated in very many places. We return the constant value DB_DECRYPTION_FAILED in order to avoid code duplication in the callers and to allow tail calls. innodb_fk_error(): Report a FOREIGN KEY error. dict_foreign_def_get(), dict_foreign_def_get_fields(): Remove. This code was being used in dict_create_add_foreign_to_dictionary() in an apparently uncovered code path. That ib_push_warning() call would pass the integer i+1 instead of a pointer to NUL terminated string ("%s"), and therefore the call should have resulted in a crash. dict_print_info_on_foreign_key_in_create_format(), innobase_quote_identifier(): Add const qualifiers. row_mysql_get_table_error(): Replaces row_mysql_get_table_status(). Display no message on DB_CORRUPTION; it should be properly reported at the SQL layer anyway.	2024-09-06 14:29:09 +03:00
Yuchen Pei	60b93cdd30	Merge branch '10.5' into 10.6	2024-09-06 13:52:57 +10:00
Libing Song	5bbda97111	MDEV-33853 Async rollback prepared transactions during binlog crash recovery Summary: During server recovery, active transactions are rolled back automatically by the InnoDB background rollback thread. Prepared transactions are committed or rolled back accordingly by binlog recovery. Binlog recovery is done in the main thread before the server can provide service to users. If there is a big transaction to roll back, the server will not be available for a long time. This patch provides a way to roll back the prepared transactions asynchronously, so that the rollback will not block server startup. Design: - Handler::recover_rollback_by_xid() This patch provides a new handler interface to roll back transactions in the recovery phase. InnoDB just sets the transaction's state to active. Then the transaction will be rolled back by the background rollback thread. - Handler::signal_tc_log_recover_done() This function is called after opening the tc log (typically the binlog) is done. When this function is called, all transactions that are to be rolled back have been reverted to the ACTIVE state. Thus it starts the rollback thread to roll back the transactions. - Background rollback thread With this patch, the start of the background rollback thread is deferred until binlog recovery is finished. It is started by innobase_tc_log_recovery_done().	2024-09-05 21:19:25 +03:00
Daniel Black	8024b8e4c1	MDEV-33091 pcre2 headers - handle columnstore From e735cf2ed7cefb2af36f10f3cb47dfc060789df3, the PCRE_INCLUDES changed to PCRE_INCLUDE_DIRS for consistency. The columnstore module depends on the old name. Create a mapping for the columnstore submodule. 10.6+ fix for submodule is: * https://github.com/mariadb-corporation/mariadb-columnstore-engine/pull/3304	2024-09-05 12:14:06 +10:00
Sergei Golubchik	b2ebe1cb7b	MDEV-33091 pcre2 headers aren't found on Solaris use pkg-config to find pcre2, if possible rename PCRE_INCLUDES to use PKG_CHECK_MODULES naming, PCRE_INCLUDE_DIRS	2024-09-05 12:14:06 +10:00
Marko Mäkelä	fe5829a121	MDEV-34446 SIGSEGV on SET GLOBAL innodb_log_file_size with memory-mapped log file log_t::resize_write(): Advance log_sys.resize_lsn and reset the resize_log offset to START_OFFSET whenever the memory-mapped buffer would wrap around. Previously, in case the initial target offset would be beyond the requested innodb_log_file_size, we only adjusted the offset but not the LSN. An incorrect LSN would cause log_sys.buf_free to be out of bounds when the log resizing completes. The log_sys.lsn_lock will cover the entire duration of replicating memory-mapped log for resizing. We just need a mutex that is compatible with the caller holding log_sys.latch. While the choice of mtr_t::finisher (for normal log writes) depends on mtr_t::spin_wait_delay, replicating the log during resizing is a rare operation where we can afford possible additional context switching overhead.	2024-09-04 14:24:30 +03:00
Marko Mäkelä	a5b80531fb	Merge 11.4 into 11.6	2024-09-04 10:38:25 +03:00
Marko Mäkelä	9f0b106631	MDEV-34845 buf_flush_buffer_pool(): Assertion `!os_aio_pending_reads()' failed buf_flush_buffer_pool(): Wait for any pending asynchronous reads to complete. This assertion failed in a run where buf_read_ahead_linear() had been triggered in an SQL statement that was executed right before shutdown. Reviewed by: Debarun Banerjee	2024-09-03 18:22:10 +03:00
Marko Mäkelä	9878238f74	MDEV-34791: Redundant page lookups hurt performance btr_cur_t::search_leaf(): When the index root page is also a leaf page, we may need to upgrade our existing shared root page latch into an exclusive latch. Even if we end up waiting, the root page won't be able to go away while we hold an index()->lock. The index page may be split; that is all. btr_latch_prev(): Acquire the page latch while holding a buffer-fix and an index tree latch. Merge the change buffer if needed. Use buf_pool_t::page_fix() for this special case instead of complicating buf_page_get_low() and buf_page_get_gen(). row_merge_read_clustered_index(): Remove some code that does not seem to be useful. No difference was observed with regard to removing this code when a CREATE INDEX or OPTIMIZE TABLE statement was run concurrently with sysbench oltp_update_index --tables=1 --table_size=1000 --threads=16. buf_pool_t::unzip(): Decompress a ROW_FORMAT=COMPRESSED page. buf_pool_t::page_fix(): Handle also ROW_FORMAT=COMPRESSED pages as well as change buffer merge. Optionally return an error. Add a flag for suppressing a page latch wait and a special return value -1 to indicate that the call would block. This is the preferred way of buffer-fixing blocks. The functions buf_page_get_gen() and buf_page_get_low() are only being invoked with rw_latch=RW_NO_LATCH in operations on SPATIAL INDEX. buf_page_t: Define some static functions for interpreting state(). buf_page_get_zip(), buf_read_page(), buf_read_ahead_random(), buf_read_ahead_linear(): Remove the redundant parameter zip_size. We must look up the tablespace and can invoke fil_space_t::zip_size() on it. buf_page_get_low(): Require mtr!=nullptr. buf_page_get_gen(): Implement some lock downgrading during recovery. ibuf_page_low(): Use buf_pool_t::page_fix() in a debug check. We do wait for a page read here, because otherwise a debug assertion in buf_page_get_low() in the test innodb.ibuf_delete could occasionally fail. 
PageConverter::operator(): Invoke buf_pool_t::page_fix() in order to possibly evict a block. This allows us to remove some special case code from buf_page_get_low().	2024-09-03 14:15:57 +03:00
Marko Mäkelä	44733aa8cf	Merge 11.2 into 11.4	2024-08-29 19:10:38 +03:00
Marko Mäkelä	e91a799458	Merge 10.11 into 11.2	2024-08-29 16:02:57 +03:00
Marko Mäkelä	984606d747	MDEV-34750 SET GLOBAL innodb_log_file_size is not crash safe The recent commit `4ca355d863` (MDEV-33894) caused a serious regression for online InnoDB ib_logfile0 resizing, breaking crash-safety unless the memory-mapped log file interface is being used. However, the log resizing was broken also before this. To prevent such regressions in the future, we extend the test innodb.log_file_size_online with a kill and restart of the server and with some writes running concurrently with the log size change. When run enough times, this test revealed all the bugs that are being fixed by the code changes. log_t::resize_start(): Do not allow the resized log to start before the current log sequence number. In this way, there is no need to copy anything to the first block of resize_buf. The previous logic regarding that was incorrect in two ways. First, we would have to copy from the last written buffer (buf or flush_buf). Second, we failed to ensure that the mini-transaction end marker bytes would be 1 in the buffer. If the source ib_logfile0 had wrapped around an odd number of times, the end marker would be 0. This was occasionally observed when running the test innodb.log_file_size_online. log_t::resize_write_buf(): To adjust for the resize_start() change, do not write anything that would be before the resize_lsn. Take the buffer (resize_buf or resize_flush_buf) as a parameter. Starting with commit `4ca355d863` we no longer swap buffers when rewriting the last log block. log_t::append(): Define as a static function; only some debug assertions need to refer to the log_sys object. innodb_log_file_size_update(): Wake up the buf_flush_page_cleaner() if needed, and wait for it to complete a batch while waiting for the log resizing to be completed. If the current LSN is behind the resize target LSN, we will write redundant FILE_CHECKPOINT records to ensure that the log resizing completes.
If the buf_pool.flush_list is empty or the buf_flush_page_cleaner() is stuck for some reason, our wait will time out in 5 seconds, so that we can periodically check if the execution of SET GLOBAL innodb_log_file_size was aborted. Previously, we could get into a busy loop here while the buf_flush_page_cleaner() would remain idle.	2024-08-29 14:53:08 +03:00
Marko Mäkelä	cfcf27c6fe	Merge 10.6 into 10.11	2024-08-29 07:47:29 +03:00
Marko Mäkelä	0e76c1ba94	Merge 10.5 into 10.6	2024-08-28 15:51:36 +03:00
Marko Mäkelä	1ff6b6f0b4	MDEV-34802 Recovery fails to note some log corruption recv_recovery_from_checkpoint_start(): Abort startup due to log corruption if we were unable to parse the entire log between the latest log checkpoint and the corresponding FILE_CHECKPOINT record. Also, reduce some code bloat related to log output and log_sys.mutex. Reviewed by: Debarun Banerjee	2024-08-28 15:44:42 +03:00
Yuchen Pei	18d3f63a4e	MDEV-32627 Spider: use CONNECTION string in SQLDriverConnect This is the CS part of the implementation of MENT-2070.	2024-08-28 16:43:07 +10:00
Marko Mäkelä	bda40ccb85	MDEV-34803 innodb_lru_flush_size is no longer used In commit `fa8a46eb68` (MDEV-33613) the parameter innodb_lru_flush_size ceased to have any effect. Let us declare the parameter as deprecated and additionally as MARIADB_REMOVED_OPTION, so that there will be a warning written to the error log in case the option is specified in the command line. Let us also do the same for the parameter innodb_purge_rseg_truncate_frequency that was deprecated&ignored earlier in MDEV-32050. Reviewed by: Debarun Banerjee	2024-08-28 07:18:03 +03:00
Marko Mäkelä	e7bb9b7c55	MDEV-24923 fixup: Correct a function comment	2024-08-27 18:06:24 +03:00
Marko Mäkelä	48becffd07	Merge 10.5 into 10.6	2024-08-27 08:52:10 +03:00
Yuchen Pei	58bc83e1a7	[fixup] Spider: Restored lines accidentally deleted in MDEV-32157 Also restored a change that resulted in off-by-one, as well as appending the correctly indexed key_hint.	2024-08-27 15:36:39 +10:00
Marko Mäkelä	36ab75a498	MDEV-34515: Fix a bogus debug assertion purge_sys_t::stop_FTS(): Fix an incorrect debug assertion that commit `d58734d781` added. The assertion would fail if there had been prior invocations of purge_sys.stop_SYS() without purge_sys.resume_SYS(). The intention of the assertion is to check that number of pending stop_FTS() stays below 65536.	2024-08-27 07:27:24 +03:00
Marko Mäkelä	76f6b6d818	MDEV-34515: Reduce context switching in purge Before this patch, the InnoDB purge coordinator task submitted innodb_purge_threads-1 tasks even if there was not sufficient amount of work for all of them. For example, if there are undo log records only for 1 table, only 1 task can be employed, and that task had better be the purge coordinator. srv_purge_worker_task_low(): Split from purge_worker_callback(). trx_purge_attach_undo_recs(): Remove the parameter n_purge_threads, and add the parameter n_work_items, to keep track of the amount of work. trx_purge(): Launch purge worker tasks only if necessary. The work of one thread will be executed by this purge coordinator thread. que_fork_scheduler_round_robin(): Merged to trx_purge(). Thanks to Vladislav Vaintroub for supplying a prototype of this. Reviewed by: Debarun Banerjee	2024-08-26 12:23:17 +03:00
Marko Mäkelä	b7b9f3ce82	MDEV-34515: Contention between purge and workload In a Sysbench oltp_update_index workload that involves 1 table, a serious contention between the workload and the purge of history was observed. This was the worst when the table contained only 1 record. This turned out to be fixed by setting innodb_purge_batch_size=128, which corresponds to the number of usable persistent rollback segments. When we go above that, there would be contention between row_purge_poss_sec() and the workload, typically on the clustered index page latch, sometimes also on a secondary index page latch. It might be that with smaller batches, trx_sys.history_size() will end up pausing all concurrent transaction start/commit frequently enough so that purge will be able to make some progress, so that there would be less contention on the index page latches between purge and SQL execution. In commit `aa719b5010` (part of MDEV-32050) the interpretation of the parameter innodb_purge_batch_size was slightly changed. It would correspond to the maximum desired size of the purge_sys.pages cache. Before that change, the parameter was referring to a number of undo log pages, but the accounting might have been inaccurate. To avoid a regression, we will reduce the default value to innodb_purge_batch_size=127, which will also be compatible with innodb_undo_tablespaces>1 (which will disable rollback segment 0). Additionally, some logic in the purge and MVCC checks is simplified. The purge tasks will make use of purge_sys.pages when accessing undo log pages to find out if a secondary index record can be removed. If an undo page needs to be looked up in buf_pool.page_hash, we will merely buffer-fix it. This is correct, because the undo pages are append-only in nature. Holding purge_sys.latch or purge_sys.end_latch or the fact that the current thread is executing as a part of an in-progress purge batch will prevent the contents of the undo page from being freed and subsequently reused. 
The buffer-fix will prevent the page from being evicted from the buffer pool. Thanks to this logic, we can refer to the undo log record directly in the buffer pool page and avoid copying the record. buf_pool_t::page_fix(): Look up and buffer-fix a page. This is useful for accessing undo log pages, which are append-only by nature. There will be no need to deal with change buffer or ROW_FORMAT=COMPRESSED in that case. purge_sys_t::view_guard::view_guard(): Allow the type of guard to be acquired: end_latch, latch, or no latch (in case we are a purge thread). purge_sys_t::view_guard::get(): Read-only accessor to purge_sys.pages. purge_sys_t::get_page(): Invoke buf_pool_t::page_fix(). row_vers_old_has_index_entry(): Replaced with row_purge_is_unsafe() and row_undo_mod_sec_unsafe(). trx_undo_get_undo_rec(): Merged to trx_undo_prev_version_build(). row_purge_poss_sec(): Add the parameter mtr and remove redundant or unused parameters sec_pcur, sec_mtr, is_tree. We will use the caller's mtr object but release any acquired page latches before returning. btr_cur_get_page(), page_cur_get_page(): Do not invoke page_align(). row_purge_remove_sec_if_poss_leaf(): Return the value of PAGE_MAX_TRX_ID to be checked against the page in row_purge_remove_sec_if_poss_tree(). If the secondary index page was not changed meanwhile, it will be unnecessary to invoke row_purge_poss_sec() again. trx_undo_prev_version_build(): Access any undo log pages using the caller's mini-transaction object. row_purge_vc_matches_cluster(): Moved to the only compilation unit that needs it. Reviewed by: Debarun Banerjee	2024-08-26 12:23:06 +03:00
Marko Mäkelä	d58734d781	MDEV-34520 purge_sys_t::wait_FTS sleeps 10ms, even if it does not have to There were two separate Atomic_counter<uint32_t>, purge_sys.m_SYS_paused and purge_sys.m_FTS_paused. In purge_sys.wait_FTS() we have to read both atomically. We used to use an overkill solution for this, acquiring purge_sys.latch and waiting 10 milliseconds between samples. To make matters worse, the 10-millisecond wait was unconditional, which would unnecessarily suspend the purge_coordinator_task every now and then. It turns out that we can fold both "reference counts" into a single Atomic_relaxed<uint32_t> and avoid the purge_sys.latch. To assess whether std::memory_order_relaxed is acceptable, we should consider the operations that read these "reference counts", that is, purge_sys_t::wait_FTS(bool) and purge_sys_t::must_wait_FTS(). Outside debug assertions, purge_sys.must_wait_FTS() is only invoked in trx_purge_table_acquire(), which is covered by a shared dict_sys.latch. We would increment the counter as part of a DDL operation, but before acquiring an exclusive dict_sys.latch. So, a purge_sys_t::close_and_reopen() loop could be triggered slightly prematurely, before a problematic DDL operation is actually executed. Decrementing the counter is less of an issue; purge_sys.resume_FTS() or purge_sys.resume_SYS() would mostly be invoked while holding an exclusive dict_sys.latch; ha_innobase::delete_table() does it outside that critical section. Still, this would only cause some extra wait in the purge_coordinator_task, just like at the start of a DDL operation. There are two calls to purge_sys_t::wait_FTS(bool): in the above mentioned purge_sys_t::close_and_reopen() and in purge_sys_t::clone_oldest_view(), both invoked by the purge_coordinator_task. There is also a purge_sys.clone_oldest_view<true>() call at startup when no DDL operation can be in progress. purge_sys_t::m_SYS_paused: Merged into m_FTS_paused, using a new multiplier PAUSED_SYS = 65536. 
purge_sys_t::wait_FTS(): Remove an unnecessary sleep as well as the access to purge_sys.latch. It suffices to poll purge_sys.m_FTS_paused. purge_sys_t::stop_FTS(): Do not acquire purge_sys.latch. Reviewed by: Debarun Banerjee	2024-08-26 12:22:44 +03:00
Marko Mäkelä	9db2b327d4	MDEV-34759: buf_page_get_low() is unnecessarily acquiring exclusive latch buf_page_ibuf_merge_try(): A new, separate function for invoking ibuf_merge_or_delete_for_page() when needed. Use the already requested page latch for determining if the call is necessary. If it is and if we are currently holding rw_latch==RW_S_LATCH, upgrading to an exclusive latch may involve waiting that another thread acquires and releases a U or X latch on the page. If we have to wait, we must recheck if the call to ibuf_merge_or_delete_for_page() is still needed. If the page turns out to be corrupted, we will release and fail the operation. Finally, the exclusive page latch will be downgraded to the originally requested latch. ssux_lock_impl::rd_u_upgrade_try(): Attempt to upgrade a shared lock to an update lock. sux_lock::s_x_upgrade_try(): Attempt to upgrade a shared lock to exclusive. sux_lock::s_x_upgrade(): Upgrade a shared lock to exclusive. Return whether a wait was elided. ssux_lock_impl::u_rd_downgrade(), sux_lock::u_s_downgrade(): Downgrade an update lock to shared.	2024-08-23 13:27:50 +03:00
Monty	1f040ae048	MDEV-34043 Drastically slower query performance between CentOS (2sec) and Rocky (48sec) One cause of the slowdown is because the ftruncate call can be much slower on some systems. ftruncate() is called by Aria for internal temporary tables, tables created by the optimizer, when the upper level asks Aria to delete the previous result set. This is needed when some content from previous tables changes. I have now changed Aria so that for internal temporary tables we don't call ftruncate() anymore for maria_delete_all_rows(). I also had to update the Aria repair code to use the logical datafile size and not the on-disk datafile size, which may contain data from a previous result set. The repair code is called to create indexes for the internal temporary table after it is filled. I also replaced a call to mysql_file_size() with a pwrite() in _ma_bitmap_create_first(). Reviewer: Sergei Petrunia <sergey@mariadb.com> Tester: Dave Gosselin <dave.gosselin@mariadb.com>	2024-08-21 22:47:29 +03:00
Thirunarayanan Balathandayuthapani	22b48bb393	MDEV-34756 Validation of new foreign key skipped if innodb_alter_copy_bulk=ON - During copy algorithm, InnoDB should disable bulk insert operation if the table has foreign key relation and foreign key check is enabled.	2024-08-21 18:58:20 +05:30
Oleksandr Byelkin	492a7c2430	Merge branch '11.5' into 11.6	2024-08-21 15:13:47 +02:00
Oleksandr Byelkin	342fa29615	Merge branch '11.4' into 11.5	2024-08-21 11:52:54 +02:00
Oleksandr Byelkin	eb70e0d6e2	Merge branch '11.2' into 11.4	2024-08-21 09:30:54 +02:00
Oleksandr Byelkin	6197e6abc4	Merge branch '10.11' into 11.2	2024-08-21 07:58:46 +02:00
Oleksandr Byelkin	70afc62750	Merge branch '10.6' into 10.11	2024-08-20 10:00:39 +02:00
Marko Mäkelä	267c0fce56	Merge 10.5 into 10.6	2024-08-15 10:16:46 +03:00
Marko Mäkelä	e40dfcdd89	Fix clang++-19 -Wunused-but-set-variable	2024-08-15 10:13:49 +03:00
Marko Mäkelä	62bfcfd8b2	Merge 10.6 into 10.11	2024-08-14 11:36:52 +03:00
Marko Mäkelä	757c368139	Merge 10.5 into 10.6	2024-08-14 10:56:11 +03:00
Marko Mäkelä	4f8803c036	MDEV-34678 pthread_mutex_init() without pthread_mutex_destroy() When SUX_LOCK_GENERIC is defined, the srw_mutex, srw_lock, sux_lock are implemented based on pthread_mutex_t and pthread_cond_t. This is the only option for systems that lack a futex-like system call. In the SUX_LOCK_GENERIC mode, if pthread_mutex_init() is allocating some resources that need to be freed by pthread_mutex_destroy(), a memory leak could occur when we are repeatedly invoking pthread_mutex_init() without a pthread_mutex_destroy() in between. pthread_mutex_wrapper::initialized: A debug field to track whether pthread_mutex_init() has been invoked. This also helps find bugs like the one that was fixed by commit `1c8af2ae53` (MDEV-34422); one simply needs to add -DSUX_LOCK_GENERIC to the CMAKE_CXX_FLAGS to catch that particular bug on the initial server bootstrap. buf_block_init(), buf_page_init_for_read(): Invoke block_lock::init() because buf_page_t::init() will no longer do that. buf_page_t::init(): Instead of invoking lock.init(), assert that it has already been invoked (the lock is vacant). add_fts_index(), build_fts_hidden_table(): Explicitly invoke index_lock::init() in order to avoid a pthread_mutex_destroy() invocation on an uninitialized object. srw_lock_debug::destroy(): Invoke readers_lock.destroy(). trx_sys_t::create(): Invoke trx_rseg_t::init() on all rollback segments in order to guarantee a deterministic state for shutdown, even if InnoDB fails to start up. trx_rseg_array_init(), trx_temp_rseg_create(), trx_rseg_create(): Invoke trx_rseg_t::destroy() before trx_rseg_t::init() in order to balance pthread_mutex_init() and pthread_mutex_destroy() calls.	2024-08-14 07:54:15 +03:00
Marko Mäkelä	12b01d740b	MDEV-34707: BUF_GET_RECOVER assertion failure on upgrade buf_page_get_gen(): Relax the assertion once more. The LSN may grow by invoking ibuf_upgrade(), that is, when upgrading files where innodb_change_buffering!=none was used. The LSN may also have been recovered from a log that needs to be upgraded to the current format.	2024-08-13 08:20:18 +03:00
Jan Lindström	cd8b8bb964	MDEV-34594 : Assertion `client_state.transaction().active()' failed in int wsrep_thd_append_key(THD, const wsrep_key, int, Wsrep_service_key_type) CREATE TABLE [SELECT\|REPLACE SELECT] is CTAS and idea was that we force ROW format. However, it was not correctly enforced and keys were appended before wsrep transaction was started. At THD::decide_logging_format we should force used stmt binlog format to ROW in CTAS case and produce a warning if used binlog format was not ROW. At ha_innodb::update_row we should not append keys similarly as in ha_innodb::write_row if sql_command is SQLCOM_CREATE_TABLE. Improved error logging on ::write_row, ::update_row and ::delete_row if wsrep key append fails. Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>	2024-08-12 23:54:30 +02:00
Oleksandr Byelkin	f8a735d6d8	Merge branch '10.11' into mariadb-10.11.9	2024-08-09 08:53:20 +02:00
Oleksandr Byelkin	2e580dc2a8	Merge branch '10.6' into mariadb-10.6.19	2024-08-09 08:51:21 +02:00
Nikita Malyavin	25e2d0a6bb	MDEV-34632 Assertion failed in handler::assert_icp_limitations Assertion `table->field[0]->ptr >= table->record[0] && table->field[0]->ptr <= table->record[0] + table->s->reclength' failed in handler::assert_icp_limitations. table->move_fields has some limitations: 1. It cannot be used in cascade 2. It should always have a restoring pair. Rule 1 is covered by assertions in handler::assert_icp_limitations and handler::ptr_in_record (commit `30894fe9a9`). Rule 2 should be manually maintained with care. Hopefully, the rule 1 assertions may sometimes help as well. In ha_myisam::repair, both rules are broken. table->move_fields is used asymmetrically there: it is set on every param->fix_record call (i.e. in compute_vcols) but is restored only once, at the end of repair. The reason for updating the field ptrs on every call is that compute_vcols can (supposedly) be called in parallel, that is, with the same table, but different records. The condition to "unmove" the pointers in ha_myisam::restore_vcos_after_repair is incorrect when stored vcols are available, and myisam stores a VIRTUAL field if it's the only field in the table (the record cannot be of zero length). This patch solves the problem by "unmoving" the pointers symmetrically, in compute_vcols. That is, both rules will be maintained.	2024-08-07 14:50:19 +02:00
Yuchen Pei	337307a943	spider_string: use c_ptr_safe() instead of ptr() in error messages Most other uses of ptr() are accompanied with a length, so we leave them alone.	2024-08-07 12:23:46 +02:00
Sergei Petrunia	ba0d8aeffa	Fix rocksdb.unique_check: do not have two threads waiting on the same name	2024-08-07 11:26:15 +03:00
Yuchen Pei	fa8ce92cc0	MDEV-34682 Return the return value of ddl recovery done in ha_initialize_handlerton Otherwise it could cause false negative when ddl recovery done is part of the plugin initialization	2024-08-07 15:13:08 +10:00
Yuchen Pei	bce3f3f628	MDEV-34682 Reset spider_hton_ptr in error mode of spider_db_init()	2024-08-07 15:10:56 +10:00
Oleksandr Byelkin	d6444022ca	Merge branch 'bb-11.5-release' into bb-11.6-release	2024-08-06 17:28:38 +02:00
Oleksandr Byelkin	ea75a0b600	Merge branch '11.4' into 11.5	2024-08-05 17:50:18 +02:00
Sergei Golubchik	6f357feaf0	suppress rocksdb warning	2024-08-04 17:28:03 +02:00
Oleksandr Byelkin	1640c9b06e	Merge branch '11.2' into 11.4	2024-08-04 17:27:48 +02:00
Oleksandr Byelkin	dced6cbdb6	Merge branch '11.1' into 11.2	2024-08-03 09:50:16 +02:00
mariadb-DebarunBanerjee	e515e80773	MDEV-34689 Redo log corruption at high load Issue: During mtr_t::commit, if there is not enough space available in redo log buffer, we flush the buffer. During flush, the LSN lock is released allowing other concurrent mtr to commit. After flush we reacquire the lock but use the old LSN obtained before check. This could lead to redo log corruption, as the LSN moves backwards, with the possibility of data loss and an unrecoverable server if the server aborts for any reason or if the server is shut down with innodb_fast_shutdown=2. With normal shutdown, recovery fails to map the checkpoint LSN to correct offset. In debug mode it hits log0log.cc:863: lsn_t log_t::write_buf() Assertion `new_buf_free == ((lsn - first_lsn) & write_size_1)' failed. In release mode, after normal shutdown, restart fails. [ERROR] InnoDB: Missing FILE_CHECKPOINT(8416546) at 8416546 [ERROR] InnoDB: Log scan aborted at LSN 8416546 Backup fails reading the corrupt redo log. [00] 2024-07-31 20:59:10 Retrying read of log at LSN=7334851 [00] FATAL ERROR: 2024-07-31 20:59:11 Was only able to copy log from 7334851 to 7334851, not 8416446; try increasing innodb_log_file_size Unless a backup is tried or the server is shut down or killed immediately, the corrupt redo part is eventually truncated and there may not be any visible issues seen in release mode. This issue was introduced by the following commit. commit `a635c40648` MDEV-27774 Reduce scalability bottlenecks in mtr_t::commit() Fix: If we need to release latch and flush redo before writing mtr logs, make sure to get the latest system LSN after reacquiring the redo system latch.	2024-08-03 13:11:35 +05:30
Oleksandr Byelkin	80abd847da	Merge branch '10.11' into 11.1	2024-08-03 09:32:42 +02:00
Oleksandr Byelkin	0e8fb977b0	Merge branch '10.6' into 10.11	2024-08-03 09:15:40 +02:00
Oleksandr Byelkin	8f020508c8	Merge branch '10.5' into 10.6	2024-08-03 09:04:24 +02:00
Thirunarayanan Balathandayuthapani	37119cd256	MDEV-29010 Table cannot be loaded after instant ALTER Reason: ====== - InnoDB fails to load the instant alter table metadata from clustered index while loading the table definition. The reason is that the InnoDB metadata blob has a column length that exceeds the maximum fixed length column size. Fix: === - InnoDB should treat the long fixed length column as variable length fields that need external storage while initializing the field map for instant alter operation	2024-08-01 18:58:43 +05:30
Thirunarayanan Balathandayuthapani	533e6d5d13	MDEV-34670 IMPORT TABLESPACE unnecessary traverses tablespace list Problem: ======== - After the commit `ada1074bb1` (MDEV-14398) fil_crypt_set_encrypt_tables() iterates through all tablespaces to fill the default_encrypt tables list. This was a trigger to encrypt or decrypt when key rotation age is set to 0. But import tablespace does call fil_crypt_set_encrypt_tables() unnecessarily. The motivation for the call is to signal the encryption threads. Fix: ==== ha_innobase::discard_or_import_tablespace: Remove the fil_crypt_set_encrypt_tables() and add the import tablespace to the default encrypt list if necessary	2024-07-31 14:13:38 +05:30
Thirunarayanan Balathandayuthapani	ee5f7692d7	MDEV-34357 InnoDB: Assertion failure in file ./storage/innobase/page/page0zip.cc line 4211 During InnoDB root page split, InnoDB does the following 1) First move the root records to the new page(p1) 2) Empty the root, insert the node pointer to the root page 3) Split the new page and make it as child nodes. 4) Finds the split record, allocate another new page(p2) to the index 5) InnoDB stores the record(ret) predecessor to the supremum record of the page (p2). 6) In page_copy_rec_list_start(), move the records from p1 to p2 up to the split record 7) Given table is a compressed row format page, InnoDB attempts to compress the page p2 and fails (due to innodb_compression_level = 0) 8) Since the compression fails, InnoDB gets the number of preceding records(ret_pos) of a record (ret) on the page (p2) 9) Page (p2) is a new page, ret points to infimum record. ret_pos can be 0. InnoDB has a wrong condition that ret_pos shouldn't be 0 and returns corruption. InnoDB has a similar wrong check in page_copy_rec_list_end()	2024-07-30 14:36:34 +05:30
Marko Mäkelä	1c8af2ae53	MDEV-34422 Corrupted ib_logfile0 due to uninitialized log_sys.lsn_lock In commit `bf0b82d24b` (MDEV-33515) the function log_t::init_lsn_lock() was removed. This was fine on those platforms where InnoDB uses futex-based mutexes (Linux, FreeBSD, OpenBSD, NetBSD, DragonflyBSD). Dave Gosselin debugged this on Apple macOS and submitted a fix where pthread_mutex_wrapper::pthread_mutex_wrapper() would invoke init(). We do not really need that; we only need to invoke lsn_lock.init() like we used to do before commit `bf0b82d24b`. This should be a no-op for the futex based mutexes, which intentionally rely on zero initialization. The missing pthread_mutex_init() call would cause race conditions and corruption of log_sys.buf because multiple threads could apparently hold log_sys.lsn_lock concurrently in log_t::append_prepare(). The error would be caught by a debug assertion in log_t::write_buf(), or in non-debug builds by the fact that the server cannot be restarted due to an apparently missing FILE_CHECKPOINT record (because it had been written to wrong offset in log_sys.buf). The failure in log_t::append_prepare() was caught on Microsoft Windows after enabling SUX_LOCK_GENERIC and therefore forcing the use of pthread_mutex_wrapper for the log_sys.lsn_lock. It appears to be fine to omit the pthread_mutex_init() call on GNU/Linux. log_t::create(): Invoke lsn_lock.init(). log_t::close(): Invoke lsn_lock.destroy(). To better catch this kind of issues in the future by simply defining SUX_LOCK_GENERIC on any platform, a separate debug instrumentation patch will be applied to the 10.6 branch later. Reviewed by: Debarun Banerjee	2024-07-30 11:58:02 +03:00
Thirunarayanan Balathandayuthapani	c038b3c05e	MDEV-34181 Instant table aborts after discard tablespace - commit `85db534731` (MDEV-33400) retains the instantness in the table definition after discard tablespace. So there is no need to assign n_core_null_bytes during instant table preparation unless they are not initialized.	2024-07-30 13:31:43 +05:30
Thirunarayanan Balathandayuthapani	cc8eefb0dc	MDEV-33087 ALTER TABLE...ALGORITHM=COPY should build indexes more efficiently - During copy algorithm, InnoDB should use bulk insert operation for row by row insert operation. By doing this, copy algorithm can effectively build indexes. This optimization is disabled for temporary table, versioning table and table which has foreign key relation. Introduced the variable innodb_alter_copy_bulk to allow the bulk insert operation for copy alter operation inside InnoDB. This is enabled by default ha_innobase::extra(): HA_EXTRA_END_ALTER_COPY mode tries to apply the buffered bulk insert operation, updates the non-persistent table stats. row_merge_bulk_t::write_to_index(): Update stat_n_rows after applying the bulk insert operation row_ins_clust_index_entry_low(): In case of copy algorithm, switch to bulk insert operation. copy_data_error_ignore(): Handles the error while copying the data from source to target file.	2024-07-30 11:59:01 +05:30
Monty	4bf7c966b3	MDEV-34664: Add an option to fix InnoDB's doubling of secondary index cardinalities (With trivial fixes by sergey@mariadb.com) Added option fix_innodb_cardinality to optimizer_adjust_secondary_key_costs Using fix_innodb_cardinality disables the 'divide by 2' of rec_per_key_int in InnoDB that in effect doubles the Cardinality for secondary keys. This has the biggest effect for indexes where a few rows have the same key value. Using this may also cause table scans for very small tables (which in some cases may be better than an index scan). The user visible effect is that 'SHOW INDEX FROM table_name' will for InnoDB show the true Cardinality (and not 2x the real value). It will also allow the optimizer to choose a better index in some cases as the division by 2 could have a bad effect for tables with 2-5 identical values per key. A few notes about using fix_innodb_cardinality: - It has a direct effect on SHOW INDEX FROM table_name. SHOW INDEX will also update the statistics in table share. - The effect of fix_innodb_cardinality for query plans or EXPLAIN is only visible after first open of the table. This is why one must do a flush tables or use SHOW INDEX for the option to take effect. - Using fix_innodb_cardinality can thus affect all users in their query plans if they are using the same tables. Because of this, it is strongly recommended that one uses optimizer_adjust_secondary_key_costs=fix_innodb_cardinality mainly in configuration files to not cause issues for other users.	2024-07-29 16:40:53 +03:00
Marko Mäkelä	7e5c9ccda5	MDEV-34502 fixup: Do not cripple MSAN We need to work around deficiencies of Valgrind, and apparently the previous work-around attempts (such as `d247d64988`) do not work anymore, definitely not on recent clang-based compilers. MemorySanitizer should be fine; unfortunately we set HAVE_valgrind for it as well.	2024-07-29 15:04:16 +03:00
Marko Mäkelä	7ead48a72b	MDEV-34458: Remove more traces of BTR_MODIFY_PREV In commit `2f6df93748` we fixed an observed case of the bug by removing some code related to the no longer needed BTR_MODIFY_PREV mode. In commit `73ad436e16` an alternative fix was applied that also fixes the BTR_SEARCH_PREV case. Let us clean up some implicit references to BTR_MODIFY_PREV that were missed in `2f6df93748`. btr_pcur_move_backward_from_page(): Assume that the latch mode was BTR_SEARCH_LEAF. btr_pcur_move_to_prev(): Assert that the latch mode is BTR_SEARCH_LEAF. This function is mostly invoked in row0sel.cc for read operations, as well as in row0merge.cc for reading from the clustered index. All callers indeed use a cursor in the BTR_SEARCH_LEAF mode.	2024-07-29 14:13:30 +03:00
Thirunarayanan Balathandayuthapani	3359ac09a4	MDEV-34066 Output of SHOW ENGINE INNODB STATUS uses the nanoseconds suffix for microseconds - This issue is caused by commit `e71e613353` (MDEV-24671). Change the output of transaction lock wait time in microseconds suffix.	2024-07-23 21:36:13 +05:30
Oleksandr Byelkin	0fe39d368a	Merge branch '10.6' into 10.11	2024-07-22 15:14:50 +02:00
Oleksandr Byelkin	88711ee509	New columnstore 23.10.2	2024-07-19 14:17:08 +02:00
Oleksandr Byelkin	9af2caca33	Merge branch '10.5' into 10.6	2024-07-18 16:25:33 +02:00
Sergei Golubchik	d20518168a	also protect the /*!999999 sandbox comment	2024-07-17 21:25:40 +02:00
Sutou Kouhei	383d53edbc	MDEV-21166 Add Mroonga initialized check to Mroonga UDFs Mroonga UDFs can't be used without loading Mroonga.	2024-07-17 15:52:58 +10:00
Yuchen Pei	03a350378a	MDEV-30408 Reset explicit_limit in exists2in Item_exists_subselect::fix_length_and_dec() sets explicit_limit to 1. In the exists2in transformation it resets select_limit to NULL. For consistency we should reset explicit_limit too. This fixes a bug where spider table returns wrong results for queries that get through exists2in transformation when semijoin is off.	2024-07-17 08:54:40 +08:00
Yuchen Pei	8416fd323c	MDEV-32627 [fixup] Spider: Fix conn key length To avoid off-by-one in spider_get_share. And document conn key and conn key length.	2024-07-17 08:52:35 +08:00
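
The counter folding described in the MDEV-34520 entry above (two pause counts merged into one 32-bit word with the multiplier PAUSED_SYS = 65536) can be illustrated with a minimal sketch. This is not the InnoDB code: the class name `PauseCounter`, its method names, and the `also_sys` mask logic are assumptions made for illustration, and it uses plain `std::atomic` rather than MariaDB's `Atomic_relaxed` wrapper. Only the multiplier value and the use of relaxed memory ordering are taken from the commit message.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the folded "reference counts".
// Low 16 bits: FTS pause count. High 16 bits: SYS pause count.
class PauseCounter {
  // Multiplier named in the commit message.
  static constexpr std::uint32_t PAUSED_SYS = 65536;
  std::atomic<std::uint32_t> word{0};

public:
  std::uint32_t fts_count() const {
    return word.load(std::memory_order_relaxed) % PAUSED_SYS;
  }
  std::uint32_t sys_count() const {
    return word.load(std::memory_order_relaxed) / PAUSED_SYS;
  }

  void stop_FTS() { word.fetch_add(1, std::memory_order_relaxed); }
  void resume_FTS() {
    assert(fts_count() > 0);  // guard against underflow into the SYS bits
    word.fetch_sub(1, std::memory_order_relaxed);
  }
  void stop_SYS() { word.fetch_add(PAUSED_SYS, std::memory_order_relaxed); }
  void resume_SYS() {
    assert(sys_count() > 0);
    word.fetch_sub(PAUSED_SYS, std::memory_order_relaxed);
  }

  // A single load observes both counts together, so a reader needs no latch
  // to get a consistent pair. (Assumed semantics: also_sys selects whether
  // the SYS count is considered in addition to the FTS count.)
  bool must_wait(bool also_sys) const {
    const std::uint32_t v = word.load(std::memory_order_relaxed);
    return also_sys ? v != 0 : (v & (PAUSED_SYS - 1)) != 0;
  }
};
```

A caller that polls would loop on `must_wait()` instead of taking a latch and sleeping between two separate reads. The sketch assumes each count stays below 65536 so that the two fields never overlap.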


28346 commits