mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-30 02:30:06 +01:00

Author	SHA1	Message	Date
Sergei Golubchik	738d4604b7	cmake: rename backup component to Backup for consistency	2023-02-12 12:15:22 +01:00
Marko Mäkelä	685d958e38	MDEV-14425 Improve the redo log for concurrency The InnoDB redo log used to be formatted in blocks of 512 bytes. The log blocks were encrypted and the checksum was calculated while holding log_sys.mutex, creating a serious scalability bottleneck. We remove the fixed-size redo log block structure altogether and essentially turn every mini-transaction into a log block of its own. This allows encryption and checksum calculations to be performed on local mtr_t::m_log buffers, before acquiring log_sys.mutex. The mutex only protects a memcpy() of the data to the shared log_sys.buf, as well as the padding of the log, in case the to-be-written part of the log would not end in a block boundary of the underlying storage. For now, the "padding" consists of writing a single NUL byte, to allow recovery and mariadb-backup to detect the end of the circular log faster. Like the previous implementation, we will overwrite the last log block over and over again, until it has been completely filled. It would be possible to write only up to the last completed block (if no more recent write was requested), or to write dummy FILE_CHECKPOINT records to fill the incomplete block, by invoking the currently disabled function log_pad(). This would require adjustments to some logic around log checkpoints, page flushing, and shutdown. An upgrade after a crash of any previous version is not supported. Logically empty log files from a previous version will be upgraded. An attempt to start up InnoDB without a valid ib_logfile0 will be refused. Previously, the redo log used to be created automatically if it was missing. Only with innodb_force_recovery=6, it is possible to start InnoDB in read-only mode even if the log file does not exist. This allows the contents of a possibly corrupted database to be dumped.
Because a prepared backup from an earlier version of mariadb-backup will create a 0-sized log file, we will allow an upgrade from such log files, provided that the FIL_PAGE_FILE_FLUSH_LSN in the system tablespace looks valid. The 512-byte log checkpoint blocks at 0x200 and 0x600 will be replaced with 64-byte log checkpoint blocks at 0x1000 and 0x2000. The start of log records will move from 0x800 to 0x3000. This allows us to use 4096-byte aligned blocks for all I/O in a future revision. We extend the MDEV-12353 redo log record format as follows. (1) Empty mini-transactions or extra NUL bytes will not be allowed. (2) The end-of-minitransaction marker (a NUL byte) will be replaced with a 1-bit sequence number, which will be toggled each time when the circular log file wraps back to the beginning. (3) After the sequence bit, a CRC-32C checksum of all data (excluding the sequence bit) will be written. (4) If the log is encrypted, 8 bytes will be written before the checksum and included in it. This is part of the initialization vector (IV) of encrypted log data. (5) File names, page numbers, and checkpoint information will not be encrypted. Only the payload bytes of page-level log will be encrypted. The tablespace ID and page number will form part of the IV. (6) For padding, arbitrary-length FILE_CHECKPOINT records may be written, with all-zero payload, and with the normal end marker and checksum. The minimum size is 7 bytes, or 7+8 with innodb_encrypt_log=ON. In mariadb-backup and in Galera snapshot transfer (SST) scripts, we will no longer remove ib_logfile0 or create an empty ib_logfile0. Server startup will require a valid log file. When resizing the log, we will create a logically empty ib_logfile101 at the current LSN and use an atomic rename to replace ib_logfile0 with it. See the test innodb.log_file_size. Because there is no mandatory padding in the log file, we are able to create a dummy log file as of an arbitrary log sequence number.
See the test mariabackup.huge_lsn. The parameter innodb_log_write_ahead_size and the INFORMATION_SCHEMA.INNODB_METRICS counter log_padded will be removed. The minimum value of innodb_log_buffer_size will be increased to 2MiB (because log_sys.buf will replace recv_sys.buf) and the increment adjusted to 4096 bytes (the maximum log block size). The following INFORMATION_SCHEMA.INNODB_METRICS counters will be removed: os_log_fsyncs, os_log_pending_fsyncs, log_pending_log_flushes, log_pending_checkpoint_writes. The following status variables will be removed: Innodb_os_log_fsyncs (this is included in Innodb_data_fsyncs), Innodb_os_log_pending_fsyncs (this was limited to at most 1 by design). log_sys.get_block_size(): Return the physical block size of the log file. This is only implemented on Linux and Microsoft Windows for now, and for the power-of-2 block sizes between 64 and 4096 bytes (the minimum and maximum size of a checkpoint block). If the block size is anything else, the traditional 512-byte size will be used via normal file system buffering. If the file system buffers can be bypassed, a message like the following will be issued: InnoDB: File system buffers for log disabled (block size=512 bytes) InnoDB: File system buffers for log disabled (block size=4096 bytes) This has been tested on Linux and Microsoft Windows with both sizes. On Linux, only enable O_DIRECT on the log for innodb_flush_method=O_DSYNC. Tests in 3 different environments where the log is stored in a device with a physical block size of 512 bytes are yielding better throughput without O_DIRECT. This could be due to the fact that in the event the last log block is being overwritten (if multiple transactions would become durable at the same time, and each of them will write a small number of bytes to the last log block), it should be faster to re-copy data from log_sys.buf or log_sys.flush_buf to the kernel buffer, to be finally written at fdatasync() time.
The parameter innodb_flush_method=O_DSYNC will imply O_DIRECT for data files. This option will enable O_DIRECT on the log file on Linux. It may be unsafe to use when the storage device does not support FUA (Force Unit Access) mode. When the server is compiled WITH_PMEM=ON, we will use memory-mapped I/O for the log file if the log resides on a "mount -o dax" device. We will identify PMEM in a start-up message: InnoDB: log sequence number 0 (memory-mapped); transaction id 3 On Linux, we will also invoke mmap() on any ib_logfile0 that resides in /dev/shm, effectively treating the log file as persistent memory. This should speed up "./mtr --mem" and increase the test coverage of PMEM on non-PMEM hardware. It also allows users to estimate how much the performance would be improved by installing persistent memory. On other tmpfs file systems such as /run, we will not use mmap(). mariadb-backup: Eliminated several variables. We will refer directly to recv_sys and log_sys. backup_wait_for_lsn(): Detect non-progress of xtrabackup_copy_logfile(). In this new log format with arbitrary-sized blocks, we can only detect log file overrun indirectly, by observing that the scanned log sequence number is not advancing. xtrabackup_copy_logfile(): On PMEM, do not modify the sequence bit, because we are not allowed to modify the server's log file, and our memory mapping is read-only. trx_flush_log_if_needed_low(): Do not use the callback on pmem. Using neither flush_lock nor write_lock around PMEM writes seems to yield the best performance. The pmem_persist() calls may still be somewhat slower than the pwrite() and fdatasync() based interface (PMEM mounted without -o dax). recv_sys_t::buf: Remove. We will use log_sys.buf for parsing. recv_sys_t::MTR_SIZE_MAX: Replaces RECV_SCAN_SIZE. recv_sys_t::file_checkpoint: Renamed from mlog_checkpoint_lsn. recv_sys_t, log_sys_t: Removed many data members. recv_sys.lsn: Renamed from recv_sys.recovered_lsn. 
recv_sys.offset: Renamed from recv_sys.recovered_offset. log_sys.buf_size: Replaces srv_log_buffer_size. recv_buf: A smart pointer that wraps log_sys.buf[recv_sys.offset] when the buffer is being allocated from the memory heap. recv_ring: A smart pointer that wraps a circular log_sys.buf[] that is backed by ib_logfile0. The pointer will wrap from recv_sys.len (log_sys.file_size) to log_sys.START_OFFSET. For the record that wraps around, we may copy file name or record payload data to the auxiliary buffer decrypt_buf in order to have a contiguous block of memory. The maximum size of a record is less than innodb_page_size bytes. recv_sys_t::parse(): Take the smart pointer as a template parameter. Do not temporarily add a trailing NUL byte to FILE_ records, because we are not supposed to modify the memory-mapped log file. (It is attached in read-write mode already during recovery.) recv_sys_t::parse_mtr(): Wrapper for recv_sys_t::parse(). recv_sys_t::parse_pmem(): Like parse_mtr(), but if PREMATURE_EOF would be returned on PMEM, use recv_ring to wrap around the buffer to the start. mtr_t::finish_write(), log_close(): Do not enforce log_sys.max_buf_free on PMEM, because it has no meaning on the mmap-based log. log_sys.write_to_buf: Count writes to log_sys.buf. Replaces srv_stats.log_write_requests and export_vars.innodb_log_write_requests. Protected by log_sys.mutex. Updated consistently in log_close(). Previously, mtr_t::commit() conditionally updated the count, which was inconsistent. log_sys.write_to_log: Count swaps of log_sys.buf and log_sys.flush_buf, for writing to log_sys.log (the ib_logfile0). Replaces srv_stats.log_writes and export_vars.innodb_log_writes. Protected by log_sys.mutex. log_sys.waits: Count waits in append_prepare(). Replaces srv_stats.log_waits and export_vars.innodb_log_waits. recv_recover_page(): Do not unnecessarily acquire log_sys.flush_order_mutex. 
We are inserting the blocks in arbitrary order anyway, to be adjusted in recv_sys.apply(true). We will change the definition of flush_lock and write_lock to avoid potential false sharing. Depending on sizeof(log_sys) and CPU_LEVEL1_DCACHE_LINESIZE, the flush_lock and write_lock could share a cache line with each other or with the last data members of log_sys. Thanks to Matthias Leich for providing https://rr-project.org traces for various failures during the development, and to Thirunarayanan Balathandayuthapani for his help in debugging some of the recovery code. And thanks to the developers of the rr debugger for a tool without which extensive changes to InnoDB would be very challenging to get right. Thanks to Vladislav Vaintroub for useful feedback and to him, Axel Schwenke and Krunal Bauskar for testing the performance.	2022-01-21 16:03:47 +02:00
Marko Mäkelä	c22107fd90	Merge 10.6 into 10.7	2021-11-29 11:42:07 +02:00
Marko Mäkelä	51c89849d1	Merge 10.5 into 10.6	2021-11-29 11:39:34 +02:00
Marko Mäkelä	d4cb177603	Merge 10.4 into 10.5	2021-11-29 11:16:20 +02:00
Marko Mäkelä	4da2273876	Merge 10.3 into 10.4	2021-11-29 10:59:22 +02:00
Marko Mäkelä	289721de9a	Merge 10.2 into 10.3	2021-11-29 10:33:06 +02:00
Alexey Bychko	fe065f8d90	MDEV-22522 RPM packages have meaningless summary/description this patch moves cpack summary and description for optional packages to the appropriate CMakeLists.txt files	2021-11-23 11:29:24 +07:00
Vladislav Vaintroub	009f3e06f3	improve build, allow sql library to be built in parallel with builtins	2021-11-09 17:02:45 +02:00
Sergei Krivonos	f7c6c02a06	Revert "improve build, allow sql library to be built in parallel with builtins" This reverts commit `1a3570dec3`.	2021-11-09 15:44:07 +02:00
Vladislav Vaintroub	1a3570dec3	improve build, allow sql library to be built in parallel with builtins	2021-11-09 12:06:49 +02:00
Sergei Golubchik	db20c77782	mariabackup: rename encryption_plugin -> xb_plugin because plugin code is not only about encryption anymore (also loads provider plugins), and xb_ prefix prevents name clashes with the server code (that mariabackup links with).	2021-10-27 15:55:14 +02:00
Oleksandr Byelkin	6efb5e9f5e	Merge branch '10.5' into 10.6	2021-08-02 10:11:41 +02:00
Oleksandr Byelkin	ae6bdc6769	Merge branch '10.4' into 10.5	2021-07-31 23:19:51 +02:00
Oleksandr Byelkin	7841a7eb09	Merge branch '10.3' into 10.4	2021-07-31 22:59:58 +02:00
Marko Mäkelä	b50ea90063	Merge 10.2 into 10.3	2021-07-22 18:57:54 +03:00
Heinz Wiesinger	751ebe44fd	Add feature summary at the end of cmake. This gives a short overview over found/missing dependencies as well as enabled/disabled features. Initial author Heinz Wiesinger <heinz@m2mobi.com> Additions by Vicențiu Ciorbaru <vicentiu@mariadb.org> * Report all plugins enabled via MYSQL_ADD_PLUGIN * Simplify code. Eliminate duplication by making use of WITH_xxx variable values to set feature "ON" / "OFF" state. Reviewed by: wlad@mariadb.com (code details) serg@mariadb.com (the idea)	2021-07-21 10:22:56 +03:00
Vladislav Vaintroub	9701759b3d	MDEV-23043 Refactor Windows service handling Removed the existing nt_service classes - they provide little abstraction, and only obscure a relatively simple service handling. This replaces them with similar code inspired by MS docs samples. Service handling is now moved into winmain.cc, which contains the main() function for Windows. winmain provides reporting callbacks, which should be used by external code, to report transitions from starting to running to shutting down to stopped. Removed a do-nothing ServiceMain thread, and the non-working service "pause/continue". Removed a lot of #ifdef __WIN__ code from mysqld.cc	2020-07-04 18:24:40 +02:00
mysqlonarm	dec3f8ca69	MDEV-22641: Provide SIMD optimized wrapper for zlib crc32() (#1558) Existing implementation used my_checksum (from mysys) for calculating table checksum and binlog checksum. This implementation was optimized for powerpc only and lacked SIMD implementation for x86 (using clmul) and ARM (using ACLE); instead it used zlib-crc32. mariabackup had its own copy of the crc32 implementation using hardware optimized implementation only for x86 and lacked hardware based implementation for powerpc and ARM. The patch helps unify all such calls and aggregate all of them using a unified interface my_checksum(). Said unification also enables hardware optimized calls for all architectures, viz. x86, ARM, POWERPC. The default always falls back to zlib crc32. Thanks to Daniel Black for reviewing, fixing and testing PowerPC changes. Thanks to Marko and Daniel for early code feedback.	2020-06-01 11:34:06 +03:00
Rasmus Johansson	9e1b3af4a4	MDEV-21303 Make executables MariaDB named To change all executables to have a mariadb name I had to: - Do name changes in every CMakeLists.txt that produces executables - CREATE_MARIADB_SYMLINK was removed and GET_SYMLINK added by Wlad to reuse the function in other places also - The scripts/CMakeLists.txt could make use of GET_SYMLINK instead of introducing redundant code, but I thought I'll leave that for next release - A lot of changes to debian/.install and debian/.links files due to swapping of real executable and symlink. I did not however change the name of the manpages, so the real name is still mysql there and mariadb are symlinks. - The Windows part needed a change now when we made the executables mariadb-named. MSI (and ZIP) do not support symlinks and to not break backward compatibility we had to include mysql named binaries also. Done by Wlad	2020-03-21 20:20:29 +01:00
Marko Mäkelä	a983b24407	Merge 10.4 into 10.5	2020-01-28 14:17:09 +02:00
Oleksandr Byelkin	bfc24bb2ec	Merge branch '10.3' into 10.4	2020-01-24 14:50:23 +01:00
Oleksandr Byelkin	ceda5f724f	Merge branch '10.2' into 10.3	2020-01-24 14:16:20 +01:00
Oleksandr Byelkin	f2ccfcaca1	Merge branch '10.1' into 10.2	2020-01-24 13:46:49 +01:00
Julius Goryavsky	982294ac16	MDEV-17601: MariaDB Galera does not expect 'mbstream' as streamfmt Setting "streamfmt=mbstream" in the "[sst]" section causes SST to fail because the format automatically switches to 'tar' by default (instead of mbstream). To fix this, we need to add mbstream to the list of valid values for the format, making it synonymous with xbstream. This must be done both in the SST script and when parsing the options of the corresponding utilities.	2020-01-21 10:50:48 +01:00
Sergei Golubchik	ff5a528f26	mysqltest crashes on Debian Debian is apparently offended that pcre2-posix implements POSIX API, thus it renames all posix-compatible symbols in libpcre2-posix to have the PCRE2 prefix. But Debian doesn't do anything to pcre2posix.h header, so any unaware application will get POSIX compatible type names and function prototypes from pcre2, but actual symbols will come from libc. To remedy this enormous incongruity we have to redefine POSIX-compatible function names in pcre2posix to match Debian's hack.	2020-01-16 18:13:55 +01:00
Alexey Botchkov	9dadfdcde5	MDEV-14024 PCRE2. Related changes in the server code.	2019-12-21 10:34:02 +01:00
Vladislav Vaintroub	6dc71d4f10	improve build, allow sql library to be built in parallel with builtins	2019-06-30 17:48:19 +02:00
Vladislav Vaintroub	5e4b657dd4	MDEV-18531 : Use WolfSSL instead of YaSSL as "bundled" SSL/encryption library - Add new submodule for WolfSSL - Build and use wolfssl and wolfcrypt instead of yassl/taocrypt - Use HAVE_WOLFSSL instead of HAVE_YASSL - Increase MY_AES_CTX_SIZE, to avoid compile time asserts in my_crypt.cc (sizeof(EVP_CIPHER_CTX) is larger on WolfSSL)	2019-05-22 13:48:25 +02:00
Oleksandr Byelkin	c07325f932	Merge branch '10.3' into 10.4	2019-05-19 20:55:37 +02:00
Marko Mäkelä	be85d3e61b	Merge 10.2 into 10.3	2019-05-14 17:18:46 +03:00
Marko Mäkelä	26a14ee130	Merge 10.1 into 10.2	2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru	cb248f8806	Merge branch '5.5' into 10.1	2019-05-11 22:19:05 +03:00
Brave Galera Crew	36a2a185fe	Galera4	2019-01-23 15:30:00 +04:00
Marko Mäkelä	77cbaa96ad	Merge 10.2 into 10.3	2019-01-17 12:38:46 +02:00
Vladislav Vaintroub	a8a27e65a8	MDEV-18212 mariabackup: Make output format uniform whenever possible	2019-01-15 14:15:04 +01:00
Sergei Golubchik	c9717dc019	Merge branch '10.2' into 10.3	2018-05-11 13:15:10 +02:00
Sergei Golubchik	9b1824dcd2	Merge branch '10.1' into 10.2	2018-05-10 13:01:42 +02:00
Vladislav Vaintroub	f98496da96	MDEV-16105: Mariabackup does not support SSL The reason is the missing HAVE_OPENSSL define for mariabackup.	2018-05-08 19:52:08 +00:00
Sergei Golubchik	b1818dccf7	Merge branch '10.2' into 10.3	2018-03-28 17:31:57 +02:00
Sergei Golubchik	c764bc0a78	Merge branch '10.1' into 10.2	2018-03-25 13:02:52 +02:00
Vladislav Vaintroub	af86422f08	MDEV-13023 mariabackup does not preserve holes for page compressed tables. Changed "local" datasink logic to detect page compressed Innodb tables. Whenever such table is detected, holes in the copied files are created by skipping over binary zeros at the end of each compressed page.	2018-03-23 15:30:01 +00:00
Teemu Ollakka	33aad1d273	MDEV-15505 Fixes to compilation without -DWITH_WSREP:BOOL=ON Removed including wsrep_api.h from service_wsrep.h. This caused various kinds of collisions with definitions when wsrep is not supposed to be built in. Defined functions wsrep_xid_seqno() and wsrep_xid_uuid() in wsrep_dummy.cc. Replaced wsrep_seqno_t with long long where wsrep_api.h is not included. Removed wsrep_xid_seqno() macro from wsrep_mysqld.h and made wsrep code using wsrep_xid_seqno() in handler.cc to be compiled in only if WITH_WSREP is ON. Included wsrep_api.h for mariabackup if WITH_WSREP is ON.	2018-03-21 12:02:09 +02:00
Sergei Golubchik	4040a17ea2	Compile mariabackup with its own copy of net_serv.cc Don't use the server's version, that expects a valid THD. Modify net_serv.cc to not use any THD if MYSQL_SERVER isn't defined. This reverts commit `aaddac5cd7`.	2017-08-23 19:05:13 +02:00
Vladislav Vaintroub	aaddac5cd7	fix compile errors	2017-08-23 08:27:46 +00:00
Marko Mäkelä	f20693c231	Remove a reference to a non-existent include directory	2017-07-07 11:57:39 +03:00
Marko Mäkelä	8c71c6aa8b	MDEV-12548 Initial implementation of Mariabackup for MariaDB 10.2 InnoDB I/O and buffer pool interfaces and the redo log format have been changed between MariaDB 10.1 and 10.2, and the backup code has to be adjusted accordingly. The code has been simplified, and many memory leaks have been fixed. Instead of the file name xtrabackup_logfile, the file name ib_logfile0 is being used for the copy of the redo log. Unnecessary InnoDB startup and shutdown and some unnecessary threads have been removed. Some help was provided by Vladislav Vaintroub. Parameters have been cleaned up and aligned with those of MariaDB 10.2. The --dbug option has been added, so that in debug builds, --dbug=d,ib_log can be specified to enable diagnostic messages for processing redo log entries. By default, innodb_doublewrite=OFF, so that --prepare works faster. If more crash-safety for --prepare is needed, double buffering can be enabled. The parameter innodb_log_checksums=OFF can be used to ignore redo log checksums in --backup. Some messages have been cleaned up. Unless --export is specified, Mariabackup will not deal with undo log. The InnoDB mini-transaction redo log is not only about user-level transactions; it is actually about mini-transactions. To avoid confusion, call it the redo log, not transaction log. We disable any undo log processing in --prepare. Because MariaDB 10.2 supports indexed virtual columns, the undo log processing would need to be able to evaluate virtual column expressions. To reduce the amount of code dependencies, we will not process any undo log in prepare. This means that the --export option must be disabled for now. This also means that the following options are redundant and have been removed: xtrabackup --apply-log-only innobackupex --redo-only In addition to disabling any undo log processing, we will disable any further changes to data pages during --prepare, including the change buffer merge. 
This means that restoring incremental backups should reliably work even when change buffering is being used on the server. Because of this, preparing a backup will not generate any further redo log, and the redo log file can be safely deleted. (If the --export option is enabled in the future, it must generate redo log when processing undo logs and buffered changes.) In --prepare, we cannot easily know if a partial backup was used, especially when restoring a series of incremental backups. So, we simply warn about any missing files, and ignore the redo log for them. FIXME: Enable the --export option. FIXME: Improve the handling of the MLOG_INDEX_LOAD record, and write a test that initiates a backup while an ALGORITHM=INPLACE operation is creating indexes or rebuilding a table. An error should be detected when preparing the backup. FIXME: In --incremental --prepare, xtrabackup_apply_delta() should ensure that if FSP_SIZE is modified, the file size will be adjusted accordingly.	2017-07-05 11:43:28 +03:00
Vladislav Vaintroub	ee4eda40b9	MDEV-12832 : remove libarchive support from mariabackup, due to different packaging issues. Also, Percona thinks that tar support has many limitations and should be removed as well (see discussion in https://bugs.launchpad.net/percona-xtrabackup/+bug/1681721). There is an alternative streaming format xbstream that is supported and does not have these limitations.	2017-05-21 22:19:06 +00:00
Vladislav Vaintroub	ecb25df21b	Xtrabackup 2.3.8	2017-04-27 19:12:42 +02:00
Vladislav Vaintroub	f344d7ec61	Make Ninja generator happy with BUILD_BYPRODUCTS.	2017-04-27 19:12:42 +02:00
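
The file layout described in the MDEV-14425 entry (64-byte checkpoint blocks at 0x1000 and 0x2000, log records starting at 0x3000, and a 1-bit sequence number toggled on every wrap of the circular log) can be illustrated with a minimal Python sketch. Only the three offsets and the block size come from the commit message. The helper `locate`, its parameters `first_lsn` and `file_size`, the modulo mapping of LSN to file offset, and the starting value of the sequence bit are assumptions made for illustration, not the server's actual code.

```python
# Offsets quoted in the MDEV-14425 commit message.
CHECKPOINT_OFFSETS = (0x1000, 0x2000)  # two checkpoint blocks
CHECKPOINT_SIZE = 64                   # bytes per checkpoint block
START_OFFSET = 0x3000                  # first byte available for log records


def locate(lsn, first_lsn, file_size):
    """Map an LSN to (file offset, sequence bit) in a circular log.

    Hypothetical helper: assumes first_lsn is stored at START_OFFSET and
    that the sequence bit starts at 1 and flips each time the log wraps.
    Both choices are illustrative assumptions.
    """
    if lsn < first_lsn:
        raise ValueError("lsn precedes the start of the log")
    capacity = file_size - START_OFFSET  # bytes usable for log records
    if capacity <= 0:
        raise ValueError("file too small to hold any log records")
    wraps, position = divmod(lsn - first_lsn, capacity)
    return START_OFFSET + position, 1 - (wraps & 1)
```

With a toy file of `0x3000 + 100` bytes and `first_lsn=1000`, LSN 1099 maps to the last usable byte, and LSN 1100 wraps back to 0x3000 with the sequence bit flipped. A reader that finds a record whose bit does not match the expected value for the current pass can treat it as stale data from the previous pass, which is the purpose the commit message gives for replacing the NUL end marker.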


52 commits