Commit graph

80 commits

Author SHA1 Message Date
Marko Mäkelä
1fd0099839 Merge 10.10 into 10.11 2023-02-16 11:41:18 +02:00
Marko Mäkelä
dbab3e8d90 Merge 10.6 into 10.8 2023-02-10 13:43:53 +02:00
Marko Mäkelä
6aec87544c Merge 10.5 into 10.6 2023-02-10 13:03:01 +02:00
Oleksandr Byelkin
c7c415734d Merge branch '10.10' into 10.11 2023-01-31 11:07:08 +01:00
Oleksandr Byelkin
b923b80cfd Merge branch '10.6' into 10.7 2023-01-31 09:33:58 +01:00
Khem Raj
75bbf645a6 Add missing include <cstdio>
This is needed with GCC 13 and newer [1]

[1] https://www.gnu.org/software/gcc/gcc-13/porting_to.html

Signed-off-by: Khem Raj <raj.khem@gmail.com>
2023-01-27 12:43:38 +11:00
Heiko Becker
15226a2822 Add missing include for std::runtime_error
Fixes the following error when building with gcc 13:

"tpool/aio_liburing.cc:64:18: error: 'runtime_error' is not a member of 'std'
   64 |       throw std::runtime_error("aio_uring()");"
2023-01-25 17:30:18 +11:00
Marko Mäkelä
3ec4241b00 Merge 10.10 into 10.11 2022-09-07 10:14:41 +03:00
Marko Mäkelä
0c0b697ae3 Merge 10.6 into 10.7 2022-09-07 08:56:06 +03:00
Daniel Black
fd8dbe0d2c MDEV-29443: prevent uring access to galera sst /notify scripts
The resources like uring in MariaDB aren't intended for spawned
processes so we restrict access using the io_uring_ring_dontfork
liburing library call.
2022-09-06 08:03:49 +10:00
Marko Mäkelä
fe1f8f2c6b Merge 10.10 into 10.11 2022-08-30 13:36:30 +03:00
Marko Mäkelä
b86be02ecf Merge 10.6 into 10.7 2022-08-30 13:02:42 +03:00
Marko Mäkelä
76bb671e42 Merge 10.5 into 10.6 2022-08-25 16:02:44 +03:00
Vladislav Vaintroub
a3fd9e6b06 MDEV-29367 Refactor tpool::cache
Removed use std::vector's ba push_back(), pop_back() to  make it more
obvious that memory in the vectors won't be reallocated.

Also, "borrowed" elements can be debugged a little better now,
they are put into the start of the m_cache vector.
2022-08-24 13:36:49 +02:00
Vladislav Vaintroub
c8e3bcf79b MDEV-11026 Make InnoDB number of IO write/read threads dynamic
Fix concurrency error  - avoid accessing deleted memory, when io_slots is
resized. the deleted memory in this case was vftable pointer in
aiocb::m_internal_task

The fix avoids calling dummy release function, via a flag in task_group.
2022-06-27 12:00:31 +02:00
Vladislav Vaintroub
49e660bb12 MDEV-11026 Make InnoDB number of IO write/read threads dynamic
Resize the read/write slots, and recreate the io_context (for Linux libaio)
2022-06-27 11:59:20 +02:00
Marko Mäkelä
5d0496c749 Merge 10.6 into 10.7 2022-06-23 13:20:25 +03:00
Vladislav Vaintroub
eb7f46ca1e Merge remote-tracking branch 'origin/10.5' into 10.6 2022-06-23 06:29:57 +02:00
Vladislav Vaintroub
35f2cdcb99 MDEV-28920 Rescheduling of innodb_stats_func() missing
Fixed tpool timer implementation on POSIX.
Prior to this patch, under some specific rare circumstances (concurrency
related), timer callback execution might be skipped.
2022-06-23 05:53:55 +02:00
Marko Mäkelä
6680fd8d4b Merge 10.6 into 10.7 2022-06-21 18:02:41 +03:00
Marko Mäkelä
3794673111 MDEV-28836: Memory alignment cleanup
Table_cache_instance: Define the structure aligned at
the CPU cache line, and remove a pad[] data member.
Krunal Bauskar reported this to improve performance on ARMv8.

aligned_malloc(): Wrapper for the Microsoft _aligned_malloc()
and the ISO/IEC 9899:2011 <stdlib.h> aligned_alloc().
Note: The parameters are in the Microsoft order (size, alignment),
opposite of aligned_alloc(alignment, size).
Note: The standard defines that size must be an integer multiple
of alignment. It is enforced by AddressSanitizer but not by GNU libc
on Linux.

aligned_free(): Wrapper for the Microsoft _aligned_free() and
the standard free().

HAVE_ALIGNED_ALLOC: A new test. Unfortunately, support for
aligned_alloc() may still be missing on some platforms.
We will fall back to posix_memalign() for those cases.

HAVE_MEMALIGN: Remove, along with any use of the nonstandard memalign().

PFS_ALIGNEMENT (sic): Removed; we will use CPU_LEVEL1_DCACHE_LINESIZE.

PFS_ALIGNED: Defined using the C++11 keyword alignas.

buf_pool_t::page_hash_table::create(),
lock_sys_t::hash_table::create():
lock_sys_t::hash_table::resize(): Pad the allocation size to an
integer multiple of the alignment.

Reviewed by: Vladislav Vaintroub
2022-06-21 16:59:49 +03:00
Marko Mäkelä
712b443a3c Merge 10.6 into 10.7 2022-06-02 07:48:30 +03:00
Marko Mäkelä
db0fde3f24 MDEV-28665 aio_uring::thread_routine terminates prematurely, causing hang
aio_uring::thread_routine(): Handle -EINTR from io_uring_wait_cqe()
in the same way as aio_linux::getevent_thread_routine() does it:
simply ignore it and invoke the system call again.

Reviewed by: Vladislav Vaintroub
2022-05-25 13:18:24 +03:00
Sergei Golubchik
fd132be117 Merge branch '10.6' into 10.7 2022-05-11 11:25:33 +02:00
Daniel Black
6350a52445 tpool: liburing typo in error
Also the ENOSYS is more likely explained by seccomp
filters in containers than a pre-5.1 kernel, so include
both.
2022-04-27 09:56:28 +10:00
Marko Mäkelä
c235295525 Merge 10.6 into 10.7 2022-04-14 13:31:07 +03:00
Marko Mäkelä
2aed566d22 Cleanup: alignas(CPU_LEVEL1_DCACHE_LINESIZE)
Let us replace all use of MY_ALIGNED in InnoDB with C++11 alignas.

CACHE_LINE_SIZE: Replaced with CPU_LEVEL1_DCACHE_LINESIZE.
2022-04-14 10:40:26 +03:00
Marko Mäkelä
a4d753758f Merge 10.6 into 10.7 2022-03-30 08:52:05 +03:00
Sergei Golubchik
f92388fa14 MDEV-27900 fixes
* prevent infinite recursion in beyond-EOF reads (when pread returns 0)
* reduce code duplication

followup for d78173828e and f4fb6cb3fe
2022-03-25 20:33:42 +01:00
Marko Mäkelä
e67d46e4a1 Merge 10.6 into 10.7 2022-03-14 11:30:32 +02:00
Daniel Black
f4fb6cb3fe MDEV-27900: aio handle partial reads/writes (uring)
MDEV-27900 continued for uring.

Also spell synchronously correctly in sql_parse.cc.

Reviewed by Wlad.
2022-03-12 16:16:47 +11:00
Daniel Black
bd1ba7801f Merge branch 10.5 into 10.6 2022-03-12 16:16:03 +11:00
Daniel Black
d78173828e MDEV-27900: aio handle partial reads/writes
As btrfs showed, a partial read of data in AIO /O_DIRECT circumstances can
really confuse MariaDB.

Filipe Manana (SuSE)[1] showed how database programmers can assume
O_DIRECT is all or nothing.

While a fix was done in the kernel side, we can do better in our code by
requesting that the rest of the block be read/written synchronously if
we do only get a partial read/write.

Per the APIs, a partial read/write can occur before an error, so
reattempting the request will leave the caller with a concrete error to
handle.

[1] https://lore.kernel.org/linux-btrfs/CABVffENfbsC6HjGbskRZGR2NvxbnQi17gAuW65eOM+QRzsr8Bg@mail.gmail.com/T/#mb2738e675e48e0e0778a2e8d1537dec5ec0d3d3a

Also spell synchronously correctly in other files.
2022-03-12 09:47:53 +11:00
Sergei Golubchik
65f602310c Merge branch '10.6' into 10.7 2022-02-10 21:16:50 +01:00
Sergei Golubchik
e3894f5d39 Merge branch '10.5 into 10.6 2022-02-10 21:07:03 +01:00
Vladislav Vaintroub
012e724deb MDEV-27796 Windows - starting server with huge innodb-log-buffer-size may fail
Fixed tpool::pread() and tpool::pwrite() to return SSIZE_T on Windows,
so that huge numbers are not converted to negatives.

Also, make sure to never attempt reading/writing more bytes than
DWORD can accomodate (4G)
2022-02-10 17:25:12 +01:00
Marko Mäkelä
7e8a13d9d7 Merge 10.6 into 10.7 2021-11-19 17:45:52 +02:00
Marko Mäkelä
db915f7387 MDEV-27058: Move buf_page_t::slot to IORequest::slot
MDEV-23855 and MDEV-23399 already moved some transient data fields
from buffer pool page descriptors to IORequest, but the write buffer
of PAGE_COMPRESSED or ENCRYPTED tables was missed. Since is only needed
during asynchronous page write requests, it belongs to IORequest.
2021-11-18 17:44:33 +02:00
Kartik Soneji
c356714d77 Change Find*.cmake modules to match conventions 2021-10-27 15:55:14 +02:00
Marko Mäkelä
03e4cb2484 MDEV-24512 fixup: Remove after_task_callback
In commit ff5d306e29 we removed
dbug_after_task_callback but forgot to revert the rest of
commit bada05a883.
2021-09-14 16:23:23 +03:00
Marko Mäkelä
b630f0b1b9 Merge 10.5 into 10.6 2021-06-18 09:16:20 +03:00
Vladislav Vaintroub
78bd7d86a4 MDEV-25953 Tpool - prevent potential deadlock in simulated AIO
Do not execute user callback just after pwrite. Instead, submit user
function as task into thread pool. This way, the IO thread would not hog
aiocb, which is a limited (in Innodb) resource
2021-06-17 20:39:47 +02:00
Marko Mäkelä
6ba938af62 MDEV-25905: Assertion table2==NULL in dict_sys_t::add()
In commit 49e2c8f0a6 (MDEV-25743)
we made dict_sys_t::find() incompatible with the rest of the
table name hash table operations in case the table name contains
non-ASCII octets (using a compatibility mode that facilitates the
upgrade into the MySQL 5.0 filename-safe encoding) and the target
platform implements signed char.

ut_fold_string(): Remove; replace with my_crc32c(). This also makes
table name hash value calculations independent on whether char
is unsigned or signed.
2021-06-14 12:38:56 +03:00
Marko Mäkelä
28e362eaca MDEV-25760: Resubmit IO job on -EAGAIN from io_uring
The server still may abort if there is no enough free space in the
ring buffer to resubmit the IO job, but the behavior is equal to
the failure of os_aio() -> submit_io().
2021-06-14 12:38:56 +03:00
Daniel Black
f82e69735e io_liburing: ENOMEM handling - use io_uring_mlock_size
This gives the user the size required and how to set
memlock limits for the process.

Thanks Jens Axboe for providing this requested interface

ref: https://github.com/axboe/liburing/issues/246

Also don't put \n on my_printf_error, its implicit.
2021-04-15 07:42:13 +10:00
Vladislav Vaintroub
cb545f1116 CMake cleanup
- use FIND_PACKAGE(LIBAIO) to find libaio
- Use standard CMake conventions in Find{PMEM,URING}.cmake
- Drop the LIB from LIB{PMEM,URING}_{INCLUDE_DIR,LIBRARIES}
  It is cleaner, and consistent with how other packages are handled in CMake.
  e.g successful FIND_PACKAGE(PMEM) now sets PMEM_FOUND, PMEM_LIBRARIES,
  PMEM_INCLUDE_DIR, not  LIBPMEM_{FOUND,LIBRARIES,INCLUDE_DIR}.
- Decrease the output. use FIND_PACKAGE with QUIET argument.
- for Linux packages, either liburing, or libaio is required
  If liburing is installed, libaio does not need to be present   .
  Use FIND_PACKAGE([LIBAIO|URING] REQUIRED) if either library is required.
2021-03-23 17:20:17 +01:00
Marko Mäkelä
e8113f7572 CMake cleanup: Make WITH_URING, WITH_PMEM Boolean
The new default values WITH_URING:BOOL=OFF, WITH_PMEM:BOOL=OFF imply
that the dependencies are optional.
An explicit request WITH_URING=ON or WITH_PMEM=ON will cause the
build to fail if the requested dependencies are not available.

Last, to prevent a feature to be built in even though the built-time
dependencies are available, the following can be used:

cmake -DCMAKE_DISABLE_FIND_PACKAGE_URING=1
cmake -DCMAKE_DISABLE_FIND_PACKAGE_PMEM=1

This cleanup was suggested by Vladislav Vaintroub.
2021-03-20 16:23:47 +02:00
Marko Mäkelä
a0558b8c96 MDEV-24883 fixup: Add a dependency
In commit 783625d78f we forgot to
declare a dependency on the generated file mysqld_error.h.
2021-03-15 12:11:41 +02:00
Marko Mäkelä
783625d78f MDEV-24883 add io_uring support for tpool
liburing is a new optional dependency (WITH_URING=auto|yes|no)
that replaces libaio when it is available.

aio_uring: class which wraps io_uring stuff

aio_uring::bind()/unbind(): optional optimization

aio_uring::submit_io(): mutex prevents data race. liburing calls are
thread-unsafe. But if you look into it's implementation you'll see
atomic operations. They're used for synchronization between kernel and
user-space only. That's why our own synchronization is still needed.

For systemd, we add LimitMEMLOCK=524288 (ulimit -l 524288)
because the io_uring_setup system call that is invoked
by io_uring_queue_init() requests locked memory. The value
was found empirically; with 262144, we would occasionally
fail to enable io_uring when using the maximum values of
innodb_read_io_threads=64 and innodb_write_io_threads=64.

aio_uring::thread_routine(): Tolerate -EINTR return from
io_uring_wait_cqe(), because it may occur on shutdown
on Ubuntu 20.10 (Groovy Gorilla).

This was mostly implemented by Eugene Kosov. Systemd integration
and improved startup/shutdown error handling by Marko Mäkelä.
2021-03-15 11:30:17 +02:00
Marko Mäkelä
f24b738318 MDEV-24313 (2 of 2): Silently ignored innodb_use_native_aio=1
In commit 5e62b6a5e0 (MDEV-16264)
the logic of os_aio_init() was changed so that it will never fail,
but instead automatically disable innodb_use_native_aio (which is
enabled by default) if the io_setup() system call would fail due
to resource limits being exceeded. This is questionable, especially
because falling back to simulated AIO may lead to significantly
reduced performance.

srv_n_file_io_threads, srv_n_read_io_threads, srv_n_write_io_threads:
Change the data type from ulong to uint.

os_aio_init(): Remove the parameters, and actually return an error code.

thread_pool::configure_aio(): Do not silently fall back to simulated AIO.

Reviewed by: Vladislav Vaintroub
2020-12-14 15:27:03 +02:00