Commit graph

27839 commits

Author SHA1 Message Date
Marko Mäkelä
8a6a4c947a Cleanup: Replace some is_predefined_tablespace()
In some places, there were redundant comparisons against TRX_SYS_SPACE
or SRV_TMP_SPACE_ID. The temporary tablespace is never the subject of
log-based recovery.

Also, consistently check for SRV_SPACE_ID_UPPER_BOUND.

Reviewed by: Debarun Barerjee
2024-10-04 13:41:12 +03:00
Marko Mäkelä
b249a059da MDEV-34850: Busy work while parsing FILE_ records
In mariadb-backup --backup, we only have to invoke the undo_space_trunc
and log_file_op callbacks as well as validate the mini-transaction
checksums. There is absolutely no need to access recv_sys.pages or
recv_spaces, or to allocate a decrypt_buf in case of innodb_encrypt_log=ON.
This is what the new mode recv_sys_t::store::BACKUP will do.

In the skip_the_rest: loop, the main thing is to process all FILE_ records
until the end of the log is reached.  Additionally, we must process
INIT_PAGE and FREE_PAGE records in the same way as they would be
during storing == YES.

This was measured to reduce the CPU time between the messages
"InnoDB: Multi-batch recovery needed at LSN" and
"InnoDB: End of log at LSN"
by some 20%.

recv_sys_t::store: A ternary enumeration that specifies how records
should be stored: NO, BACKUP, or YES.

recv_sys_t::parse(), recv_sys_t::parse_mtr(), recv_sys_t::parse_pmem():
Replace template<bool store> with template<store storing>.

store_freed_or_init_rec(): Simplify some logic. We can look up also
the system tablespace.

Reviewed by: Debarun Banerjee
2024-10-04 13:38:21 +03:00
Marko Mäkelä
63913ce5af Merge 10.6 into 10.11 2024-10-03 10:55:08 +03:00
Marko Mäkelä
7e0afb1c73 Merge 10.5 into 10.6 2024-10-03 09:31:39 +03:00
Marko Mäkelä
23debf214f MDEV-28091 fixup: Fix another pfs_malloc() stub
In commit 0f56e21efa
only one pfs_malloc() stub was fixed to return aligned memory.
Also, the MSVC _aligned_malloc() pairs with _aligned_free().
2024-10-03 08:15:17 +03:00
Daniel Black
1f7898f686 mroonga: remove -Wunused-but-set-variable warnings
There where unused variable. They were not conditional
on defines, so removed them.

Added an error handing in proc_object if there was no db
as subsequent operations would have failed.
2024-10-03 15:05:09 +10:00
Daniel Black
3723fd1573 MDEV-35007 mroonga should modify source files during build
CMake rewriting the tests causes Mroonga to be un-buildable
on build environments where there source directory is read
only.

In the test results, the version wasn't particularly important.

Remove the version dependence of tests.
2024-10-03 15:05:09 +10:00
Marko Mäkelä
cc70ca7eab MDEV-35059 ALTER TABLE...IMPORT TABLESPACE with FULLTEXT SEARCH may corrupt the adaptive hash index
build_fts_hidden_table(): Correct a mistake that had been made in
commit 903ae30069 (MDEV-30655).
2024-10-02 11:09:31 +03:00
Sergei Golubchik
b1bbdbab9e cleanup: remove redundant if()
likely, a result of auto-merge of two fixes in different versions
2024-10-01 18:29:11 +02:00
Sergei Golubchik
813e592763 compilation failure in CONNECT
storage/connect/tabfmt.cpp:419:24: error: '%.3d' directive writing between 3 and 10 bytes into a region of size 5 [-Werror=format-overflow=]
  419 |       sprintf(buf, "COL%.3d", i+1);
2024-10-01 18:29:11 +02:00
Marko Mäkelä
464055fe65 MDEV-34078 Memory leak in InnoDB purge with 32-column PRIMARY KEY
row_purge_reset_trx_id(): Reserve large enough offsets for accomodating
the maximum width PRIMARY KEY followed by DB_TRX_ID,DB_ROLL_PTR.

Reviewed by: Thirunarayanan Balathandayuthapani
2024-10-01 18:35:39 +03:00
Marko Mäkelä
a298dfb84c MDEV-35053 Crash in purge_sys_t::iterator::free_history_rseg()
purge_sys_t::get_page(): Avoid accessing a freed reference to pages[id]
after pages.erase(id).  This heap-use-after-free would sometimes be
caught by AddressSanitizer.

purge_sys_t::iterator::free_history_rseg(): Do not crash if undo=nullptr
(the database is corrupted).

Reviewed by: Debarun Banerjee
2024-10-01 15:03:04 +03:00
Marko Mäkelä
2d031f4a71 MDEV-34973 fixup for POWER,s390x
xtest(): Correct the declaration.
2024-10-01 13:29:59 +03:00
Max Kellermann
6715e4dfe1 MDEV-34973: innobase/dict0dict: add noexcept to lock/unlock methods
Another chance for cutting back overhead due to C++ exceptions being
enabled; the `dict_sys_t` class is a good candidate because its
locking methods are called frequently.

Binary size reduction this time:

    text	  data	   bss	   dec	   hex	filename
 24448622	2436488	9473537	36358647	22ac9f7	build/release/sql/mariadbd
 24448474	2436488	9473601	36358563	22ac9a3	build/release/sql/mariadbd
2024-10-01 09:53:16 +03:00
Max Kellermann
813123e3e0 MDEV-34973: innobase/lock0lock: add noexcept
MariaDB is compiled with C++ exceptions enabled, and that disallows
some optimizations (e.g. the stack must always be unwinding-safe).  By
adding `noexcept` to functions that are guaranteed to never throw,
some of these optimizations can be regained.  Low-level locking
functions that are called often are a good candidate for this.

This shrinks the executable a bit (tested with GCC 14 on aarch64):

    text	  data	   bss	   dec	   hex	filename
 24448910	2436488	9473185	36358583	22ac9b7	build/release/sql/mariadbd
 24448622	2436488	9473537	36358647	22ac9f7	build/release/sql/mariadbd
2024-10-01 09:53:16 +03:00
Thirunarayanan Balathandayuthapani
cc810e64d4 MDEV-34392 Inplace algorithm violates the foreign key constraint
Don't allow the referencing key column from NULL TO NOT NULL
when

 1) Foreign key constraint type is ON UPDATE SET NULL
 2) Foreign key constraint type is ON DELETE SET NULL
 3) Foreign key constraint type is UPDATE CASCADE and referenced
 column declared as NULL

Don't allow the referenced key column from NOT NULL to NULL
when foreign key constraint type is UPDATE CASCADE
and referencing key columns doesn't allow NULL values

get_foreign_key_info(): InnoDB sends the information about
nullability of the foreign key fields and referenced key fields.

fk_check_column_changes(): Enforce the above rules for COPY
algorithm

innobase_check_foreign_drop_col(): Checks whether the dropped
column exists in existing foreign key relation

innobase_check_foreign_low() : Enforce the above rules for
INPLACE algorithm

dict_foreign_t::check_fk_constraint_valid(): This is used
by CREATE TABLE statement to check nullability for foreign
key relation.
2024-10-01 09:41:56 +05:30
Max Kellermann
45298b730b sql/handler: referenced_by_foreign_key() returns bool
The method was declared to return an unsigned integer, but it is
really a boolean (and used as such by all callers).

A secondary change is the addition of "const" and "noexcept" to this
method.

In ha_mroonga.cpp, I also added "inline" to the two helper methods of
referenced_by_foreign_key().  This allows the compiler to flatten the
method.
2024-09-30 16:33:25 +03:00
Marko Mäkelä
d28ac3f82d MDEV-34207: ALTER TABLE...STATS_PERSISTENT=0 fails to drop statistics
commit_try_norebuild(): If the STATS_PERSISTENT attribute of the table
is being changed to disabled, drop the persistent statistics of the table.
2024-09-30 15:27:38 +03:00
Sergei Golubchik
b88f1267e4 MDEV-33373 part 2: Unexpected ER_FILE_NOT_FOUND upon reading from logging table after crash recovery
CSV engine shoud set my_errno if use it.
2024-09-30 13:50:51 +02:00
Marko Mäkelä
dd5ce6b0c4 MDEV-34450 os_file_write_func() is an overkill for ib_logfile0
log_file_t::read(), log_file_t::write(): Invoke pread() or pwrite()
directly, so that we can give more accurate diagnostics in case of
a failure, and so that we will avoid the overhead of setting up 5(!)
stack frames and related objects.

tpool::pwrite(): Add a missing const qualifier.
2024-09-30 13:36:38 +03:00
Daniel Black
f199dffe3b MDEV-34937 s3 engine no longer functional on non-gcc builds
Last commit on libmarias3 broke non-gcc builds by excluding
the most important aspect, the snprintf being executed.

Reviewer: Andrew Hutchings <andrew@mariadb.org>
Ref: https://github.com/mariadb-corporation/libmarias3/pull/130
2024-09-30 09:23:30 +01:00
Yuchen Pei
282b92f0a2
MDEV-34589 Do not execute before queries in spider_db_mbase::rollback()
Rollback is not supposed to fail. This prevents false failures in
spider rollback.
2024-09-30 16:16:27 +10:00
Yuchen Pei
42735c557e
MDEV-34636 Spider: reset wide_handler->trx in two occasions
ha_spider::update_create_info()
ha_spider::append_lock_tables_list()
2024-09-30 15:52:08 +10:00
Yuchen Pei
f43ea935a1
MDEV-34636 Remove implementation of ha-spider::extra() with MERGE flags 2024-09-30 15:51:18 +10:00
Yuchen Pei
69874ee95c
MDEV-34828 Remove some obsolete cmake code related to the removed spider handlersocket support
A fixup of MDEV-26858
2024-09-30 15:12:00 +10:00
Marko Mäkelä
2d3ddaef35 MDEV-34907 Bogus assertion failure and busy work while parsing FILE_ records
A server that was running with innodb_log_file_size=96M and
innodb_buffer_pool_size=6M had inserted some data into a table
that was subsequently dropped. When the server was killed and
restarted, an assertion failed in recv_sys_t::parse() while
a FSP_SIZE change was unnecessarily being processed during
the skip_the_rest: loop in recv_scan_log().

The ib_logfile0 contents was as follows:

1. The checkpoint start LSN points to the start of some mini-transaction.
2. There may be log records for modifying files for which a FILE_MODIFY
had been written before the checkpoint. These records were "purged"
by advancing the checkpoint.
3. At some point during the initial parsing with store=true the space
reserved for recv_sys.pages will run out and recv_scan_log() would switch
to the skip_the_rest: mode.
4. We encounter a log record for extending a tablespace that will be
deleted a bit later. This would trip the bogus debug assertion.
5. Later on, there would be a FILE_DELETE record for this tablespace.
6. The checkpoint end LSN points to a possibly empty sequence of
FILE_MODIFY records and a FILE_CHECKPOINT record. Recovery had parsed these
records first, before rewinding to the checkpoint start LSN.
7. There could be further records following the FILE_CHECKPOINT record.
Recovery will process all records until an inconsistency is found and
it is assumed that the end of the circular ib_logfile0 was reached.

recv_sys_t::parse(): For the template instantiation with store=false,
remove a debug assertion that could fail in a multi-batch recovery,
while recv_scan_log(false) would be in the skip_the_rest: loop.
It is very well possible that we have not encountered all FILE_ records
yet, and therefore we should not complain about unknown tablespaces.

Reviewed by: Debarun Banerjee
2024-09-27 12:31:37 +03:00
Marko Mäkelä
6acada713a MDEV-34062: Implement innodb_log_file_mmap on 64-bit systems
When using the default innodb_log_buffer_size=2m, mariadb-backup --backup
would spend a lot of time re-reading and re-parsing the log. For reads,
it would be beneficial to memory-map the entire ib_logfile0 to the
address space (typically 48 bits or 256 TiB) and read it from there,
both during --backup and --prepare.

We will introduce the Boolean read-only parameter innodb_log_file_mmap
that will be OFF by default on most platforms, to avoid aggressive
read-ahead of the entire ib_logfile0 in when only a tiny portion would be
accessed. On Linux and FreeBSD the default is innodb_log_file_mmap=ON,
because those platforms define a specific mmap(2) option for enabling
such read-ahead and therefore it can be assumed that the default would
be on-demand paging. This parameter will only have impact on the initial
InnoDB startup and recovery. Any writes to the log will use regular I/O,
except when the ib_logfile0 is stored in a specially configured file system
that is backed by persistent memory (Linux "mount -o dax").

We also experimented with allowing writes of the ib_logfile0 via a
memory mapping and decided against it. A fundamental problem would be
unnecessary read-before-write in case of a major page fault, that is,
when a new, not yet cached, virtual memory page in the circular
ib_logfile0 is being written to. There appears to be no way to tell
the operating system that we do not care about the previous contents of
the page, or that the page fault handler should just zero it out.

Many references to HAVE_PMEM have been replaced with references to
HAVE_INNODB_MMAP.

The predicate log_sys.is_pmem() has been replaced with
log_sys.is_mmap() && !log_sys.is_opened().

Memory-mapped regular files differ from MAP_SYNC (PMEM) mappings in the
way that an open file handle to ib_logfile0 will be retained. In both
code paths, log_sys.is_mmap() will hold. Holding a file handle open will
allow log_t::clear_mmap() to disable the interface with fewer operations.

It should be noted that ever since
commit 685d958e38 (MDEV-14425)
most 64-bit Linux platforms on our CI platforms
(s390x a.k.a. IBM System Z being a notable exception) read and write
/dev/shm/*/ib_logfile0 via a memory mapping, pretending that it is
persistent memory (mount -o dax). So, the memory mapping based log
parsing that this change is enabling by default on Linux and FreeBSD
has already been extensively tested on Linux.

::log_mmap(): If a log cannot be opened as PMEM and the desired access
is read-only, try to open a read-only memory mapping.

xtrabackup_copy_mmap_snippet(), xtrabackup_copy_mmap_logfile():
Copy the InnoDB log in mariadb-backup --backup from a memory
mapped file.
2024-09-26 18:47:12 +03:00
Denis Protivensky
231900e5bb MDEV-34836: TOI on parent table must BF abort SR in progress on a child
Applied SR transaction on the child table was not BF aborted by TOI running
on the parent table for several reasons:

Although SR correctly collected FK-referenced keys to parent, TOI in Galera
disregards common certification index and simply sets itself to depend on
the latest certified write set seqno.

Since this write set was the fragment of SR transaction, TOI was allowed to
run in parallel with SR presuming it would BF abort the latter.

At the same time, DML transactions in the server don't grab MDL locks on
FK-referenced tables, thus parent table wasn't protected by an MDL lock from
SR and it couldn't provoke MDL lock conflict for TOI to BF abort SR transaction.

In InnoDB, DDL transactions grab shared MDL locks on child tables, which is not
enough to trigger MDL conflict in Galera.

InnoDB-level Wsrep patch didn't contain correct conflict resolution logic due to
the fact that it was believed MDL locking should always produce conflicts correctly.

The fix brings conflict resolution rules similar to MDL-level checks to InnoDB,
thus accounting for the problematic case.

Apart from that, wsrep_thd_is_SR() is patched to return true only for executing
SR transactions. It should be safe as any other SR state is either the same as
for any single write set (thus making the two logically equivalent), or it reflects
an SR transaction as being aborting or prepared, which is handled separately in
BF-aborting logic, and for regular execution path it should not matter at all.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-24 11:14:01 +02:00
Marko Mäkelä
971cf59579 Merge 10.6 into 10.11 2024-09-24 08:49:20 +03:00
Marko Mäkelä
638c62acac MDEV-34983: Remove x86 asm from InnoDB
Starting with GCC 7 and clang 15, single-bit operations such as
fetch_or(1) & 1 are translated into 80386 instructions such as
LOCK BTS, instead of using the generic translation pattern
of emitting a loop around LOCK CMPXCHG.

Given that the oldest currently supported GNU/Linux distributions
ship GCC 7, and that older versions of GCC are out of support,
let us remove some work-arounds that are not strictly necessary.
If someone compiles the code using an older compiler, it will work
but possibly less efficiently.

srw_mutex_impl::HOLDER: Changed from 1U<<31 to 1 in order to
work around https://github.com/llvm/llvm-project/issues/37322
which is specific to setting the most significant bit.

srw_mutex_impl::WAITER: A multiplier of waiting requests.
This used to be 1, which would now collide with HOLDER.

fil_space_t::set_stopping(): Remove this unused function.

In MSVC we need _interlockedbittestandset() for LOCK BTS.
2024-09-23 12:51:27 +03:00
Daniel Black
ac5cbaff66 Aria - correct type
Aria transaction ids are uint16 rather than uint.

Change the type to be more accurate.
2024-09-21 09:11:02 +10:00
mariadb-DebarunBanerjee
35d477dd1d MDEV-34453 Trying to read 16384 bytes at 70368744161280 outside the bounds of the file: ./ibdata1
The issue is caused by a race between buf_page_create_low getting the
page from buffer pool hash and buf_LRU_free_page evicting it from LRU.

The issue is introduced in 10.6 by MDEV-27058
commit aaef2e1d8c
MDEV-27058: Reduce the size of buf_block_t and buf_page_t

The solution is buffer fix the page before releasing buffer pool mutex
in buf_page_create_low when x_lock_try fails to acquire the page latch.
2024-09-20 20:26:43 +05:30
Marko Mäkelä
9ea7f7129a MDEV-34909 DDL hang during SET GLOBAL innodb_log_file_size on PMEM
log_t::persist(): Add a parameter holding_latch to specify
whether the caller is already holding exclusive log_sys.latch,
like log_write_and_flush() always is.
2024-09-20 15:29:56 +03:00
Lena Startseva
0a5e4a0191 MDEV-31005: Make working cursor-protocol
Updated tests: cases with bugs or which cannot be run
with the cursor-protocol were excluded with
"--disable_cursor_protocol"/"--enable_cursor_protocol"

Fix for v.10.5
2024-09-18 18:39:26 +07:00
Julius Goryavsky
f176248d4b Merge branch '10.6' into '10.11' 2024-09-17 06:23:10 +02:00
Julius Goryavsky
80fff4c6b1 Merge branch '10.5' into '10.6' 2024-09-16 16:39:59 +02:00
Marko Mäkelä
b187414764 Merge 10.6 into 10.11 2024-09-16 10:58:40 +03:00
Marko Mäkelä
4010dff058 mtr_t::log_file_op(): Fix -Wnonnull
GCC 12.2.0 could issue -Wnonnull for an unreachable call to
strlen(new_path).  Let us prevent that by replacing the condition
(type == FILE_RENAME) with the equivalent (new_path).
This should also optimize the generated code, because the life time
of the parameter "type" will be reduced.
2024-09-14 11:05:44 +03:00
Marko Mäkelä
e3f653ca66 MDEV-34750 fixup: -Wconversion on 32-bit
log_t::resize_write_buf(): If d<0 and d>-length, d will fit in ssize_t,
which is a signed 32-bit or 64-bit integer. Cast from int64_t to ssize_t
to make this clear and to silence a compiler warning.
2024-09-14 10:35:28 +03:00
Marko Mäkelä
a74bea7ba9 MDEV-34879 InnoDB fails to merge the change buffer to ROW_FORMAT=COMPRESSED tables
buf_page_t::read_complete(): Fix an incorrect condition that had been
added in commit aaef2e1d8c (MDEV-27058).
Also for compressed-only pages we must remember that buffered changes
may exist.

buf_read_page(): Correct the function comment; this is for a synchronous
and not asynchronous read. Pass the parameter unzip=true to
buf_read_page_low(), because each of our callers will be interested in
the uncompressed page frame. This will cause the test
encryption.innodb-compressed-blob to emit more errors when the
correct keys for decrypting the clustered index root page are unavailable.

Reviewed by: Debarun Banerjee
2024-09-12 10:52:55 +03:00
Marko Mäkelä
f168050e90 MDEV-34791 fixup: Avoid an infinite loop with ROW_FORMAT=COMPRESSED
buf_pool_t::page_fix(): If a change buffer merge may be needed on a
ROW_FORMAT=COMPRESSED page that exists in compressed-only format in
the buffer pool, go ahead to decompress the block. This fixes an
infinite loop.

Reviewed by: Debarun Banerjee
2024-09-12 10:52:12 +03:00
Yuchen Pei
a8c5717223
Merge branch '10.6' into 10.11 2024-09-12 10:44:13 +10:00
Yuchen Pei
09b1269e4a
Merge branch '10.5' into 10.6 2024-09-12 10:17:51 +10:00
Monty
3ae4ecbfc5 MDEV-34867 engine S3 cause 500 error for huawei buckets
Add support for removing the Content-Type header to the S3 engine. This
is required for compatibility with some S3 providers.

This also adds a provider option to the S3 engine which will turn on
relevant compatibility options for specific providers.

This was required for getting MariaDB S3 engine to work with "Huawei
Cloud S3".
To get Huawei S3 storage to work on has set one of the following
S3 options:
s3_provider=Huawei
s3_ssl_no_verify=1

Author: Andrew Hutchings <andrew@mariadb.org>
2024-09-11 16:15:37 +03:00
Yuchen Pei
b168859d1e
Merge branch '10.6' into 10.11 2024-09-11 16:10:53 +10:00
Yuchen Pei
4a09e74387
Merge branch '10.5' into 10.6 2024-09-11 15:49:16 +10:00
Yuchen Pei
cc0faa1e3e
MDEV-31788 Factor functions to reduce duplication around spider_check_and_init_casual_read in ha_spider.cc
factored out static functions:
- spider_prep_loop
- spider_start_bg
- spider_send_queries
2024-09-10 11:52:26 +10:00
Yuchen Pei
0ba97e4dc6
MDEV-31788 Factor out calls to spider_ping_table_mon_from_table in ha_spider.cc 2024-09-10 11:52:26 +10:00
Yuchen Pei
9e1579788f
MDEV-31788 Factor spider locking and unlocking code around sending queries 2024-09-10 11:52:22 +10:00
Yuchen Pei
84067291b4
MDEV-28360 Spider: remove #ifdef SPIDER_use_LEX_CSTRING_for_KEY_Field_name 2024-09-10 11:19:19 +10:00