- InnoDB fails to recover the full_crc32 encrypted page from the
doublewrite buffer. The reason is that buf_dblwr_t::recover()
fails to identify the space id from the page, because the page has
been encrypted starting from the FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION
bytes.
Fix:
===
buf_dblwr_t::recover(): preserve any pages whose space_id
does not match a known tablespace. These could be encrypted pages
of tablespaces that had been created with
innodb_checksum_algorithm=full_crc32.
buf_page_t::read_complete(): If the page looks corrupted and the
tablespace is encrypted and in full_crc32 format, try to
restore the page from doublewrite buffer.
recv_dblwr_t::recover_encrypted_page(): Find the page which
has the same page number and try to decrypt the page using
space->crypt_data. After decryption, compare the space id.
Write the recovered page back to the file.
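The selection logic can be pictured with a minimal standalone sketch.
This is not the server code: the decryption callback stands in for
decryption via space->crypt_data, and the fixed page-header offsets are
spelled out as local constants.

  #include <cstddef>
  #include <cstdint>
  #include <functional>
  #include <optional>
  #include <vector>

  static constexpr std::size_t FIL_PAGE_OFFSET   = 4;   // page number
  static constexpr std::size_t FIL_PAGE_SPACE_ID = 34;  // tablespace id

  static uint32_t read_u32(const uint8_t* p) {
    return uint32_t(p[0]) << 24 | uint32_t(p[1]) << 16 |
           uint32_t(p[2]) << 8  | uint32_t(p[3]);
  }

  // Among doublewrite copies with a matching page number, decrypt each
  // candidate and accept it only if the decrypted space id matches the
  // tablespace being recovered; the caller writes the result back.
  std::optional<std::vector<uint8_t>> recover_encrypted_page(
      const std::vector<std::vector<uint8_t>>& dblwr_copies,
      uint32_t space_id, uint32_t page_no,
      const std::function<bool(std::vector<uint8_t>&)>& decrypt) {
    for (std::vector<uint8_t> copy : dblwr_copies) {  // private copy
      if (read_u32(copy.data() + FIL_PAGE_OFFSET) != page_no)
        continue;                      // different page number
      if (!decrypt(copy))
        continue;                      // decryption failed
      if (read_u32(copy.data() + FIL_PAGE_SPACE_ID) != space_id)
        continue;                      // space id does not match
      return copy;
    }
    return std::nullopt;
  }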
The invariant of write-ahead logging is that before any change to a
page is written to the data file, the corresponding log record must
first have been durably written.
On crash recovery, there were some sloppy checks for this. Let us
implement accurate checks and flag an inconsistency as a hard error,
so that we can avoid further corruption of a corrupted database.
For data extraction from the corrupted database, innodb_force_recovery
can be used.
Before recovery is reading any data pages or invoking
buf_dblwr_t::recover() to recover torn pages from the
doublewrite buffer, InnoDB will have parsed the log until the
final LSN and updated log_sys.lsn to that. So, we can rely on
log_sys.lsn at all times. The doublewrite buffer recovery has been
refactored in such a way that the recv_sys.dblwr.pages may be consulted
while discovering files and their page sizes, but nothing will be
written back to data files before buf_dblwr_t::recover() is invoked.
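As a minimal illustration of the invariant being enforced (the function
name and signature here are invented for this sketch, not taken from the
server code):

  #include <cstdint>

  // Write-ahead-logging invariant checked during recovery: a page in a
  // data file must never be newer than the log that was durably
  // written, i.e. its FIL_PAGE_LSN must not exceed log_sys.lsn.
  // A violation is now flagged as a hard error; innodb_force_recovery
  // remains the escape hatch for extracting data from a corrupted
  // database.
  bool page_lsn_is_valid(uint64_t fil_page_lsn, uint64_t log_sys_lsn) {
    return fil_page_lsn <= log_sys_lsn;
  }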
A section of the test mariabackup.innodb_redo_overwrite
that is parsing some mariadb-backup --backup output has
been removed, because that output "redo log block is overwritten"
would often be missing in a Microsoft Windows environment
as a result of these changes.
recv_max_page_lsn, recv_lsn_checks_on: Remove.
recv_sys_t::validate_checkpoint(): Validate the write-ahead-logging
condition at the end of the recovery.
recv_dblwr_t::validate_page(): Keep track of the maximum LSN
(if we are checking a non-doublewrite copy of a page) but
do not complain about the LSN being in the future. The doublewrite buffer
is a special case, because it will be read early during recovery.
Besides, starting with commit 762bcb81b5
the dblwr=true copies of pages may legitimately be "too new".
recv_dblwr_t::find_page(): Find a valid page with the smallest
FIL_PAGE_LSN that is in the valid range for recovery.
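As a rough sketch of the selection rule (illustrative only; it assumes
the usable range is bounded below by the checkpoint LSN and above by the
end of the parsed log, and that each candidate has already passed
validate_page()):

  #include <cstdint>
  #include <vector>

  struct dblwr_copy {
    uint64_t       fil_page_lsn;  // FIL_PAGE_LSN of this doublewrite copy
    const uint8_t* frame;         // the page image
  };

  // Return a copy whose FIL_PAGE_LSN lies inside the usable bounds,
  // preferring the smallest such LSN; nullptr if no copy qualifies.
  const uint8_t* find_page(const std::vector<dblwr_copy>& candidates,
                           uint64_t checkpoint_lsn, uint64_t log_end_lsn) {
    const dblwr_copy* best = nullptr;
    for (const dblwr_copy& c : candidates) {
      if (c.fil_page_lsn < checkpoint_lsn || c.fil_page_lsn > log_end_lsn)
        continue;  // outside the bounds usable for this recovery
      if (!best || c.fil_page_lsn < best->fil_page_lsn)
        best = &c;
    }
    return best ? best->frame : nullptr;
  }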
recv_dblwr_t::restore_first_page(): Replaced by find_page().
Only buf_dblwr_t::recover() will write to data files.
buf_dblwr_t::recover(): Simplify the message output. Do attempt
doublewrite recovery on user page read error. Ignore doublewrite
pages whose FIL_PAGE_LSN is outside the usable bounds. Previously,
we could wrongly recover a too new page from the doublewrite buffer.
It is unlikely that this could have led to an actual error.
Write back all recovered pages from the doublewrite buffer here,
including for the first page of any tablespace.
buf_page_is_corrupted(): Distinguish the return values
CORRUPTED_FUTURE_LSN and CORRUPTED_OTHER.
buf_page_check_corrupt(): Return the error code DB_CORRUPTION
in case the LSN is in the future.
Datafile::read_first_page(): Handle FSP_SPACE_FLAGS=0xffffffff
in the same way on both 32-bit and 64-bit architectures.
Datafile::read_first_page_flags(): Split from read_first_page().
Take a copy of the first page as a parameter.
recv_sys_t::free_corrupted_page(): Take the file as a parameter
and return whether a message was displayed. This avoids some duplicated
and incomplete error messages.
buf_page_t::read_complete(): Remove some redundant output and always
display the name of the corrupted file. Never return DB_FAIL;
use it only in internal error handling.
IORequest::read_complete(): Assume that buf_page_t::read_complete()
will have reported any error.
fil_space_t::set_corrupted(): Return whether this is the first time
the tablespace had been flagged as corrupted.
Datafile::validate_first_page(), fil_node_open_file_low(),
fil_node_open_file(), fil_space_t::read_page0(),
fil_node_t::read_page0(): Add a parameter for a copy of the
first page, and a parameter to indicate whether the FIL_PAGE_LSN
check should be suppressed. Before buf_dblwr_t::recover() is
invoked, we cannot validate the FIL_PAGE_LSN, but we can trust the
FSP_SPACE_FLAGS and the tablespace ID that may be present in a
potentially too new copy of a page.
Reviewed by: Debarun Banerjee
for large transaction
Description
===========
When a transaction commits, it copies the binlog events from the
binlog cache to the binlog file. Very large transactions
(e.g. gigabytes) can stall other transactions for a long time
because the data is copied while holding LOCK_log, which blocks
other commits from binlogging.
The solution in this patch is to rename the binlog cache file to
a binlog file instead of copying it, if the committing transaction
has a large binlog cache. Renaming is a very fast operation and does
not block other transactions for a long time.
Design
======
* binlog_large_commit_threshold
type: ulonglong
scope: global
dynamic: yes
default: 128MB
Only binlog cache temporary files larger than this threshold are
renamed to binlog files.
* #binlog_cache_files directory
To support rename, all binlog cache temporary files are now managed
as normal files. The `#binlog_cache_files` directory resides in the same
directory as the binlog files. It is created at server startup if it
doesn't exist; otherwise, all files in the directory are deleted at startup.
The temporary files are named with an ML_ prefix followed by the memory
address of the binlog_cache_data object, which guarantees that each name
is unique.
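For illustration only, a name built according to this scheme could be
produced as follows (the exact format used by the server may differ):

  #include <cstddef>
  #include <cstdio>

  // "ML_" prefix plus the address of the binlog_cache_data object,
  // placed under the #binlog_cache_files directory next to the binlogs.
  void make_cache_file_name(char* buf, std::size_t len,
                            const char* binlog_dir, const void* cache) {
    std::snprintf(buf, len, "%s/#binlog_cache_files/ML_%p",
                  binlog_dir, cache);
  }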
* Reserve space
To support the rename feature, enough space must be reserved at the
beginning of the binlog cache file. The space is required for the
Format description, Gtid list, checkpoint and Gtid events when
renaming the cache file to a binlog file.
Since binlog_cache_data's cache_log is directly accessed by the binlog,
online alter and wsrep code, it is not easy to update all the code.
Thus the binlog cache does not reserve space if it is not a session
binlog cache or if a wsrep session is enabled.
- m_file_reserved_bytes
Stores the number of bytes reserved at the beginning of the cache file.
It is initialized in write_prepare() and cleared by reset().
The reserved file header is hidden from callers, so there is no
change for callers. E.g.
- get_byte_position() still returns the length of the binlog data
written to the cache, not the file length.
- truncate(0) truncates the file to m_file_reserved_bytes rather than to 0.
- write_prepare()
write_prepare() is called every time anything is written into the
cache. It calls init_file_reserved_bytes() to create the cache file
(if it doesn't exist) and reserve suitable space once the data
written exceeds the buffer's size.
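A minimal model of this accounting (simplified; the real
binlog_cache_data carries much more state, and only the member names
mentioned above are reused here):

  #include <cstdint>

  // The reserved header is invisible to callers: positions are reported
  // relative to the end of the reserved area, and truncating to 0 keeps
  // the reserved bytes in place.
  class cache_file_model {
    uint64_t m_file_reserved_bytes = 0;  // set by write_prepare(), cleared by reset()
    uint64_t m_file_length = 0;          // physical length of the cache file

  public:
    void reserve_header(uint64_t bytes) {
      m_file_reserved_bytes = bytes;
      if (m_file_length < bytes) m_file_length = bytes;
    }

    void append(uint64_t bytes) { m_file_length += bytes; }

    // get_byte_position(): length of the binlog data, not the file length
    uint64_t get_byte_position() const {
      return m_file_length - m_file_reserved_bytes;
    }

    // truncate(0) truncates to m_file_reserved_bytes, not to 0
    void truncate(uint64_t pos) {
      m_file_length = m_file_reserved_bytes + pos;
    }
  };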
* Binlog_commit_by_rotate
It encapsulates the code for renaming a binlog cache temporary
file to a binlog file.
- should_commit_by_rotate()
It is called by write_transaction_to_binlog_events() to check if
a binlog cache should be renamed to a binlog file.
- commit()
This is the entry point for renaming a binlog cache and committing
the transaction. Both the rename and the commit are protected by
LOCK_log, so no other transaction can write anything into the
renamed binlog before it.
The rename happens during a rotation. After the new binlog file is
generated, replace_binlog_file() is called to:
- copy data from the new binlog file to the binlog cache file.
- write the gtid event.
- rename the binlog cache file to the binlog file.
After that the rotation continues as usual. Then the transaction
is committed in a separate group of its own. Its cache file is
detached and the cache log is reset before calling
trx_group_commit_with_engines(), so only the Xid event is written.
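The decision itself boils down to a size comparison against
binlog_large_commit_threshold. The following sketch is illustrative
only; in particular, exactly which size is compared is an assumption of
the sketch, not a quote of the server code:

  #include <cstdint>

  // Default of binlog_large_commit_threshold as described above: 128MB.
  static uint64_t binlog_large_commit_threshold = 128ULL << 20;

  // Only a committing transaction whose binlog cache file has grown
  // beyond the threshold takes the rename path.
  bool should_commit_by_rotate(uint64_t cache_file_bytes) {
    return cache_file_bytes > binlog_large_commit_threshold;
  }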
buf_page_t::read_complete(): Fix an incorrect condition that had been
added in commit aaef2e1d8c (MDEV-27058).
Also for compressed-only pages we must remember that buffered changes
may exist.
buf_read_page(): Correct the function comment; this is for a synchronous
and not asynchronous read. Pass the parameter unzip=true to
buf_read_page_low(), because each of our callers will be interested in
the uncompressed page frame. This will cause the test
encryption.innodb-compressed-blob to emit more errors when the
correct keys for decrypting the clustered index root page are unavailable.
Reviewed by: Debarun Banerjee
Problem:
========
- After commit ada1074bb1 (MDEV-14398),
fil_crypt_set_encrypt_tables() iterates through all tablespaces to
fill the default_encrypt tables list. This is a trigger to
encrypt or decrypt when the key rotation age is set to 0. But import
tablespace calls fil_crypt_set_encrypt_tables() unnecessarily;
the motivation for the call is only to signal the encryption threads.
Fix:
====
ha_innobase::discard_or_import_tablespace: Remove the
fil_crypt_set_encrypt_tables() call and add the imported tablespace
to the default encrypt list if necessary.
When there are no bounds on the upper or lower part of the window,
it doesn't matter if the type is numeric.
It also doesn't matter how many ORDER BY items there are in the
query.
Reviewers: Sergei Petrunia and Oleg Smirnov
This bug could affect queries containing a join of derived tables over
grouping views such that one of the derived tables contains a window
function while another uses view V with dependent subquery DSQ containing
a set function aggregated outside of the subquery in the view V. The
subquery also refers to the fields from the group clause of the view. Due
to this bug, execution of such queries could produce wrong result sets.
When the fix_fields() method performs context analysis of a set function AF,
at the very beginning the function Item_sum::init_sum_func_check()
is called. The function copies the pointer to the embedding set function,
if any, stored in THD::LEX::in_sum_func into the corresponding field of the
set function AF simultaneously changing the value of THD::LEX::in_sum_func
to point to AF. When at the very end of the fix_fields() method the function
Item_sum::check_sum_func() is called, it is supposed to restore the value
of THD::LEX::in_sum_func to point to the embedding set function. And in
fact Item_sum::check_sum_func() did it, but only for regular set functions,
not for those used in window functions. As a result, after the context
analysis of AF had finished, THD::LEX::in_sum_func still pointed to AF.
This confused the further context analysis. In particular, it led to wrong
resolution of Item_outer_ref objects in the fix_inner_refs() function.
This wrong resolution forced reading the values of grouping fields referred
in DSQ not from the temporary table used for aggregation from which they
were supposed to be read, but from the table used as the source table for
aggregation.
This patch guarantees that the value of THD::LEX::in_sum_func is properly
restored after the call of fix_fields() for any set function.
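The invariant restored by the patch can be shown with a stripped-down
model: THD::LEX is reduced to a single pointer, and the two methods keep
only the bookkeeping that matters here. This is a sketch, not the
server's class hierarchy.

  #include <cassert>

  struct Item_sum;
  struct LEX { Item_sum* in_sum_func = nullptr; };

  struct Item_sum {
    Item_sum* in_sum_func = nullptr;  // embedding set function, if any

    // Beginning of fix_fields(): remember the embedding set function
    // and make this one the "current" one.
    void init_sum_func_check(LEX& lex) {
      in_sum_func = lex.in_sum_func;
      lex.in_sum_func = this;
    }

    // End of fix_fields(): restore the embedding set function. Before
    // the fix this restore was skipped for set functions used in window
    // functions, leaving LEX::in_sum_func pointing at the fixed item.
    void check_sum_func(LEX& lex) {
      lex.in_sum_func = in_sum_func;
    }
  };

  int main() {
    LEX lex;
    Item_sum af;                  // e.g. a set function in a window function
    af.init_sum_func_check(lex);
    // ... context analysis of AF's arguments ...
    af.check_sum_func(lex);
    assert(lex.in_sum_func == nullptr);  // properly restored
  }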
Part#2, variant 2: Make the printed r_ values in JSON output consistent.
After this patch, ANALYZE output has:
- r_index_rows (NEW) - Observed number of rows before ICP or Rowid Filtering
checks. This is a per-scan average, like r_rows and "rows" are.
- r_rows (AS BEFORE) - Observed number of rows after ICP and Rowid Filtering.
- r_icp_filtered (NEW) - Observed selectivity of ICP condition.
- (AS BEFORE) observed selectivity of Rowid Filter is in
$.rowid_filter.r_selectivity_pct
- r_total_filtered - Observed combined selectivity: fraction of rows left
after applying ICP condition, Rowid Filter, and attached_condition.
This is now comparable with "filtered" and is printed right after it.
- r_filtered (AS BEFORE) - Observed selectivity of "attached_condition".
Tabular ANALYZE output is not changed. Note that JSON's r_filtered and
r_rows have the same meanings as before, matching their meaning in the
tabular output.
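For illustration of how r_total_filtered combines the other ratios
(hypothetical numbers): if r_index_rows is 1000, the ICP condition keeps
40% of them (r_icp_filtered=40), the Rowid Filter keeps half of those
(r_selectivity_pct=50, so r_rows is about 200), and attached_condition
keeps half again (r_filtered=50), then about 100 rows survive all three
checks and r_total_filtered is about 10, which can be compared directly
with the optimizer's "filtered" estimate.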