Commit graph

140 commits

Author SHA1 Message Date
Marko Mäkelä
4dd6131711 MDEV-35049: Privatize ut_fold_ulint_pair()
Now that ut_fold_ulint_pair() and ut_fold_binary() are no longer needed
for anything else than compatibility with old InnoDB data files that may
use innodb_checksum_algorithm=innodb, let us move the code to a single
compilation unit.

Reviewed by: Vladislav Lesin
2025-01-10 16:40:22 +02:00
Marko Mäkelä
2719cc4925 Merge 10.11 into 11.4 2024-12-02 11:35:34 +02:00
Marko Mäkelä
3d23adb766 Merge 10.6 into 10.11 2024-11-29 13:43:17 +02:00
Thirunarayanan Balathandayuthapani
9ba18d1aa0 MDEV-35394 Innochecksum misinterprets freed pages
- Innochecksum misinterprets the freed pages as active one.
This leads the user to think there are too many valid
pages exist.

- To avoid this confusion, innochecksum introduced one
more option --skip-freed-pages and -r to avoid the freed
pages while dumping or printing the summary of the tablespace.

- Innochecksum can safely assume the page is freed if
the respective extent doesn't belong to a segment and marked as
freed in XDES_BITMAP in extent descriptor page.

- Innochecksum shouldn't assume that zero-filled page as extent
descriptor page.

Reviewed-by: Marko Mäkelä
2024-11-27 13:00:51 +05:30
Marko Mäkelä
ebefef658e Merge 10.11 into 11.2 2024-10-18 11:32:22 +03:00
Marko Mäkelä
eca552a1a4 MDEV-34830: LSN in the future is not being treated as serious corruption
The invariant of write-ahead logging is that before any change to a
page is written to the data file, the corresponding log record must
must first have been durably written.

In crash recovery, there were some sloppy checks for this. Let us
implement accurate checks and flag an inconsistency as a hard error,
so that we can avoid further corruption of a corrupted database.
For data extraction from the corrupted database, innodb_force_recovery
can be used.

Before recovery is reading any data pages or invoking
buf_dblwr_t::recover() to recover torn pages from the
doublewrite buffer, InnoDB will have parsed the log until the
final LSN and updated log_sys.lsn to that. So, we can rely on
log_sys.lsn at all times. The doublewrite buffer recovery has been
refactored in such a way that the recv_sys.dblwr.pages may be consulted
while discovering files and their page sizes, but nothing will be
written back to data files before buf_dblwr_t::recover() is invoked.

recv_max_page_lsn, recv_lsn_checks_on: Remove.

recv_sys_t::validate_checkpoint(): Validate the write-ahead-logging
condition at the end of the recovery.

recv_dblwr_t::validate_page(): Keep track of the maximum LSN
(if we are checking a non-doublewrite copy of a page) but
do not complain LSN being in the future. The doublewrite buffer
is a special case, because it will be read early during recovery.
Besides, starting with commit 762bcb81b5
the dblwr=true copies of pages may legitimately be "too new".

recv_dblwr_t::find_page(): Find a valid page with the smallest
FIL_PAGE_LSN that is in the valid range for recovery.

recv_dblwr_t::restore_first_page(): Replaced by find_page().
Only buf_dblwr_t::recover() will write to data files.

buf_dblwr_t::recover(): Simplify the message output. Do attempt
doublewrite recovery on user page read error. Ignore doublewrite
pages whose FIL_PAGE_LSN is outside the usable bounds. Previously,
we could wrongly recover a too new page from the doublewrite buffer.
It is unlikely that this could have lead to an actual error.
Write back all recovered pages from the doublewrite buffer here,
including for the first page of any tablespace.

buf_page_is_corrupted(): Distinguish the return values
CORRUPTED_FUTURE_LSN and CORRUPTED_OTHER.

buf_page_check_corrupt(): Return the error code DB_CORRUPTION
in case the LSN is in the future.

Datafile::read_first_page_flags(): Split from read_first_page().
Take a copy of the first page as a parameter.

recv_sys_t::free_corrupted_page(): Take the file as a parameter
and return whether a message was displayed. This avoids some duplicated
and incomplete error messages.

buf_page_t::read_complete(): Remove some redundant output and always
display the name of the corrupted file. Never return DB_FAIL;
use it only in internal error handling.

IORequest::read_complete(): Assume that buf_page_t::read_complete()
will have reported any error.

fil_space_t::set_corrupted(): Return whether this is the first time
the tablespace had been flagged as corrupted.

Datafile::validate_first_page(), fil_node_open_file_low(),
fil_node_open_file(), fil_space_t::read_page0(),
fil_node_t::read_page0(): Add a parameter for a copy of the
first page, and a parameter to indicate whether the FIL_PAGE_LSN
check should be suppressed. Before buf_dblwr_t::recover() is
invoked, we cannot validate the FIL_PAGE_LSN, but we can trust the
FSP_SPACE_FLAGS and the tablespace ID that may be present in a
potentially too new copy of a page.

Reviewed by: Debarun Banerjee
2024-10-18 10:12:47 +03:00
Marko Mäkelä
bb47e575de MDEV-34830: LSN in the future is not being treated as serious corruption
The invariant of write-ahead logging is that before any change to a
page is written to the data file, the corresponding log record must
must first have been durably written.

On crash recovery, there were some sloppy checks for this. Let us
implement accurate checks and flag an inconsistency as a hard error,
so that we can avoid further corruption of a corrupted database.
For data extraction from the corrupted database, innodb_force_recovery
can be used.

Before recovery is reading any data pages or invoking
buf_dblwr_t::recover() to recover torn pages from the
doublewrite buffer, InnoDB will have parsed the log until the
final LSN and updated log_sys.lsn to that. So, we can rely on
log_sys.lsn at all times. The doublewrite buffer recovery has been
refactored in such a way that the recv_sys.dblwr.pages may be consulted
while discovering files and their page sizes, but nothing will be
written back to data files before buf_dblwr_t::recover() is invoked.

A section of the test mariabackup.innodb_redo_overwrite
that is parsing some mariadb-backup --backup output has
been removed, because that output "redo log block is overwritten"
would often be missing in a Microsoft Windows environment
as a result of these changes.

recv_max_page_lsn, recv_lsn_checks_on: Remove.

recv_sys_t::validate_checkpoint(): Validate the write-ahead-logging
condition at the end of the recovery.

recv_dblwr_t::validate_page(): Keep track of the maximum LSN
(if we are checking a non-doublewrite copy of a page) but
do not complain LSN being in the future. The doublewrite buffer
is a special case, because it will be read early during recovery.
Besides, starting with commit 762bcb81b5
the dblwr=true copies of pages may legitimately be "too new".

recv_dblwr_t::find_page(): Find a valid page with the smallest
FIL_PAGE_LSN that is in the valid range for recovery.

recv_dblwr_t::restore_first_page(): Replaced by find_page().
Only buf_dblwr_t::recover() will write to data files.

buf_dblwr_t::recover(): Simplify the message output. Do attempt
doublewrite recovery on user page read error. Ignore doublewrite
pages whose FIL_PAGE_LSN is outside the usable bounds. Previously,
we could wrongly recover a too new page from the doublewrite buffer.
It is unlikely that this could have lead to an actual error.
Write back all recovered pages from the doublewrite buffer here,
including for the first page of any tablespace.

buf_page_is_corrupted(): Distinguish the return values
CORRUPTED_FUTURE_LSN and CORRUPTED_OTHER.

buf_page_check_corrupt(): Return the error code DB_CORRUPTION
in case the LSN is in the future.

Datafile::read_first_page(): Handle FSP_SPACE_FLAGS=0xffffffff
in the same way on both 32-bit and 64-bit architectures.

Datafile::read_first_page_flags(): Split from read_first_page().
Take a copy of the first page as a parameter.

recv_sys_t::free_corrupted_page(): Take the file as a parameter
and return whether a message was displayed. This avoids some duplicated
and incomplete error messages.

buf_page_t::read_complete(): Remove some redundant output and always
display the name of the corrupted file. Never return DB_FAIL;
use it only in internal error handling.

IORequest::read_complete(): Assume that buf_page_t::read_complete()
will have reported any error.

fil_space_t::set_corrupted(): Return whether this is the first time
the tablespace had been flagged as corrupted.

Datafile::validate_first_page(), fil_node_open_file_low(),
fil_node_open_file(), fil_space_t::read_page0(),
fil_node_t::read_page0(): Add a parameter for a copy of the
first page, and a parameter to indicate whether the FIL_PAGE_LSN
check should be suppressed. Before buf_dblwr_t::recover() is
invoked, we cannot validate the FIL_PAGE_LSN, but we can trust the
FSP_SPACE_FLAGS and the tablespace ID that may be present in a
potentially too new copy of a page.

Reviewed by: Debarun Banerjee
2024-10-17 17:24:20 +03:00
Oleksandr Byelkin
51f9d62005 Merge branch '10.11' into 11.0 2023-08-09 07:53:48 +02:00
Oleksandr Byelkin
ced243a099 Merge branch '10.9' into 10.10 2023-08-05 20:34:09 +02:00
Oleksandr Byelkin
34a8e78581 Merge branch '10.6' into 10.9 2023-08-04 08:01:06 +02:00
Oleksandr Byelkin
6bf8483cac Merge branch '10.5' into 10.6 2023-08-01 15:08:52 +02:00
Oleksandr Byelkin
7564be1352 Merge branch '10.4' into 10.5 2023-07-26 16:02:57 +02:00
Marko Mäkelä
12a5fb4b36 MDEV-31641 innochecksum dies with Floating point exception
print_summary(): Skip index_ids for which index.pages is 0.
Tablespaces may contain some freed pages that used to refer to indexes
or tables that were dropped.
2023-07-10 13:46:34 +03:00
Marko Mäkelä
2e431ff7e6 Merge 10.11 into 11.0 2023-02-16 13:34:45 +02:00
Oleksandr Byelkin
76bcea3154 Merge branch '10.9' into 10.10 2023-01-31 11:01:48 +01:00
Oleksandr Byelkin
638625278e Merge branch '10.7' into 10.8 2023-01-31 09:57:52 +01:00
Oleksandr Byelkin
b923b80cfd Merge branch '10.6' into 10.7 2023-01-31 09:33:58 +01:00
Oleksandr Byelkin
c3a5cf2b5b Merge branch '10.5' into 10.6 2023-01-31 09:31:42 +01:00
Oleksandr Byelkin
a977054ee0 Merge branch '10.3' into 10.4 2023-01-28 18:22:55 +01:00
Oleksandr Byelkin
7fa02f5c0b Merge branch '10.4' into 10.5 2023-01-27 13:54:14 +01:00
Oleksandr Byelkin
dd24fa3063 Merge branch '10.3' into 10.4 2023-01-26 10:34:26 +01:00
Mikhail Chalov
567b681299 Minimize unsafe C functions usage - replace strcat() and strcpy() (and strncat() and strncpy()) with custom safe_strcat() and safe_strcpy() functions
The MariaDB code base uses strcat() and strcpy() in several
places. These are known to have memory safety issues and their usage is
discouraged. Common security scanners like Flawfinder flags them. In MariaDB we
should start using modern and safer variants on these functions.

This is similar to memory issues fixes in 19af1890b5
and 9de9f105b5 but now replace use of strcat()
and strcpy() with safer options strncat() and strncpy().

However, add '\0' forcefully to make sure the result string is correct since
for these two functions it is not guaranteed what new string will be null-terminated.

Example:

    size_t dest_len = sizeof(g->Message);
    strncpy(g->Message, "Null json tree", dest_len); strncat(g->Message, ":",
    sizeof(g->Message) - strlen(g->Message)); size_t wrote_sz = strlen(g->Message);
    size_t cur_len = wrote_sz >= dest_len ? dest_len - 1 : wrote_sz;
    g->Message[cur_len] = '\0';

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the BSD-new
license. I am contributing on behalf of my employer Amazon Web Services

-- Reviewer and co-author Vicențiu Ciorbaru <vicentiu@mariadb.org>
-- Reviewer additions:
* The initial function implementation was flawed. Replaced with a simpler
  and also correct version.
* Simplified code by making use of snprintf instead of chaining strcat.
* Simplified code by removing dynamic string construction in the first
  place and using static strings if possible. See connect storage engine
  changes.
2023-01-20 15:18:52 +02:00
Sergei Golubchik
eb26bf6e09 unify client/tool version string
it should now always be

/path/to/exe Ver <tool version> Distrib <server version> for <OS> (<ARCH>)

in all tools and clients
2023-01-19 12:39:28 +01:00
Marko Mäkelä
6b9bba41e8 MDEV-28554: Remove innodb_version
INNODB_VERSION_STR: Replaced with PACKAGE_VERSION (non-functional change).

INNODB_VERSION_SHORT: Replaced with direct use of
MYSQL_VERSION_MAJOR << 8 | MYSQL_VERSION_MINOR.

check_version(): Simplify the mariadb-backup version check,
and require the server version to be MariaDB 10.8 or later,
because that is when the InnoDB redo log format was last changed.
2022-06-03 12:20:19 +03:00
Nayuta Yanagisawa
cbf9d8a8d5 Merge 10.7 into 10.8 2022-04-13 17:52:27 +09:00
Marko Mäkelä
aa3a9d1ef5 Merge 10.6 into 10.7 2022-04-12 16:11:29 +03:00
Marko Mäkelä
ca3bbf4c0c Merge 10.5 into 10.6 2022-04-12 09:26:02 +03:00
Daniel Black
4ee00a29e3 MDEV-28250 aix test case failure innodb_zip.innochecksum_3,4k,crc32,innodb
As discovered by tracing, but also presenting in AIX fseeko
documentation, seeking beyond the EOF is acceptable, as you can write
there.

To display the same error in AIX to other implementations that return
errors on seek, we take the EFBIG error code on reading and error the
same way.

An AIX truss of an aspect of the test:

truss extra/innochecksum --page=18446744073709551615 $MYSQLD_DATADIR/test/tab1.ibd

  statx("./mysql-test/var/log/innodb_zip.innochecksum_3-4k,crc32,innodb/mysqld.1/data//test/tab1.ibd", 0x0FFFFFFFFFFFF610, 176, 010) = 0
  kopen("./mysql-test/var/log/innodb_zip.innochecksum_3-4k,crc32,innodb/mysqld.1/data//test/tab1.ibd", O_RDONLY|O_LARGEFILE) = 3
  kfcntl(3, 12, 0x00000001100006C8)               = 0
  kfcntl(3, F_GETFL, 0x00000001100A6CF8)          = 67108864
  kioctl(3, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
  klseek(3, 0, 1, 0x0FFFFFFFFFFFF3F0)             = 0
  kioctl(3, 22528, 0x0000000000000000, 0x0000000000000000) Err#25 ENOTTY
  kread(3, "DEADBEEF\0\0\0\0FFFFFFFF".., 4096)    = 4096
  klseek(3, 0, 1, 0x0FFFFFFFFFFFF450)             = 0
  klseek(3, 17592186040320, 0, 0x0FFFFFFFFFFFF450) = 0
  klseek(3, 0, 1, 0x0FFFFFFFFFFFF3F0)             = 0
  kread(3, "DEADBEEF\0\0\0\0FFFFFFFF".., 4096)    Err#27 EFBIG

An equivalent Linux trace:

ltrace extra/innochecksum --page=18446744073709551615 $MYSQLD_DATADIR/test/tab1.ibd

  stat64(0x7fff10ea2dc3, 0x7fff10ea0670, 88, 0x8026be41)         = 0
  open64("./mysql-test/var/log/innodb_zip."..., 0, 02072403160)  = 3
  fcntl64(3, 6, 0x139f180, 1)                                    = 0
  fgetpos64(0x615000000080, 0x7fff10ea0760, 1, 0)                = 0
  fseeko64(0x615000000080, 0xffffffff000, 0, 5 <unfinished ...>
  pthread_getspecific(0, 0x4d0eb8, 0x7fff10ea0490, 0)            = 0x7f7b2806d000
  <... fseeko64 resumed> )                                       = 0
  fgetpos64(0x615000000080, 0x7fff10ea0760, 1, 1)                = 0
  feof(0x615000000080)                                           = 0
  feof(0x615000000080)                                           = 1
  Error: Unable to seek to necessary offset
2022-04-07 15:17:52 +10:00
Marko Mäkelä
b2baeba415 Merge 10.7 into 10.8 2022-04-06 13:28:25 +03:00
Marko Mäkelä
2d8e38bc94 Merge 10.6 into 10.7 2022-04-06 13:00:09 +03:00
Marko Mäkelä
9d94c60f2b Merge 10.5 into 10.6 2022-04-06 12:08:30 +03:00
Marko Mäkelä
cacb61b6be Merge 10.4 into 10.5 2022-04-06 10:06:39 +03:00
Marko Mäkelä
d6d66c6e90 Merge 10.3 into 10.4 2022-04-06 08:59:09 +03:00
Marko Mäkelä
7c584d8270 Merge 10.2 into 10.3 2022-04-06 08:06:35 +03:00
Marko Mäkelä
5c69e93630 Merge 10.7 into 10.8 2022-03-30 09:34:07 +03:00
Marko Mäkelä
a4d753758f Merge 10.6 into 10.7 2022-03-30 08:52:05 +03:00
Vlad Lesin
33ff18627e MDEV-27835 innochecksum -S crashes for encrypted .ibd tablespace
As main() invokes parse_page() when -S or -D are set, it can be a case
when parse_page() is invoked when -D filename is not set, that is why
any attempt to write to page dump file must be done only if the file
name is set with -D.

The bug is caused by 2ef7a5a13a
(MDEV-13443).
2022-03-29 13:47:37 +03:00
Marko Mäkelä
b2fa874e46 MDEV-28181 The innochecksum -w option was inadvertently removed
In commit 7a4fbb55b0 (MDEV-25105)
the innochecksum option --write (-w) was removed altogether.
It should have been made a Boolean option, so that old data files
may be converted to a format that is compatible with
innodb_checksum_algorithm=strict_crc32 by executing the following:

innochecksum -n -w ibdata* */*.ibd

It would be better to use an older-version innochecksum
for such a conversion, so that page checksums will be validated
before updating the checksum.

It never was possible for innochecksum to convert files to the
innodb_checksum_algorithm=full_crc32 format that is the default
for new InnoDB data files.
2022-03-28 11:35:10 +03:00
Oleksandr Byelkin
4fb2cb1a30 Merge branch '10.7' into 10.8 2022-02-04 14:50:25 +01:00
Oleksandr Byelkin
9ed8deb656 Merge branch '10.6' into 10.7 2022-02-04 14:11:46 +01:00
Oleksandr Byelkin
f5c5f8e41e Merge branch '10.5' into 10.6 2022-02-03 17:01:31 +01:00
Oleksandr Byelkin
cf63eecef4 Merge branch '10.4' into 10.5 2022-02-01 20:33:04 +01:00
Oleksandr Byelkin
a576a1cea5 Merge branch '10.3' into 10.4 2022-01-30 09:46:52 +01:00
Oleksandr Byelkin
41a163ac5c Merge branch '10.2' into 10.3 2022-01-29 15:41:05 +01:00
Marko Mäkelä
5d54fd611f Cleanup: Replace ut_crc32c(x,y) with my_crc32c(0,x,y) 2022-01-21 16:13:04 +02:00
Vladislav Vaintroub
47e18af906 MDEV-27494 Rename .ic files to .inl 2022-01-17 16:41:51 +01:00
Marko Mäkelä
ddcb242b3c Merge 10.6 into 10.7 2021-08-04 11:52:39 +03:00
Oleksandr Byelkin
6efb5e9f5e Merge branch '10.5' into 10.6 2021-08-02 10:11:41 +02:00
Oleksandr Byelkin
ae6bdc6769 Merge branch '10.4' into 10.5 2021-07-31 23:19:51 +02:00
Oleksandr Byelkin
7841a7eb09 Merge branch '10.3' into 10.4 2021-07-31 22:59:58 +02:00