mariadb

mirror of https://github.com/MariaDB/server.git synced 2025-01-17 12:32:27 +01:00

Author	SHA1	Message	Date
Sergei Golubchik	bf2bdd1a1a	Merge branch '10.8' into 10.9	2022-05-19 14:07:55 +02:00
Sergei Golubchik	b7ffccf49b	Merge branch '10.7' into 10.8	2022-05-18 13:26:48 +02:00
Andrei	98ca71ab28	MDEV-28461 semisync-slave server recovery fails to rollback prepared transaction that is not in binlog. Post-crash recovery of --rpl-semi-sync-slave-enabled server failed to recognize a transaction in-doubt that needed to be rolled back. A prepared-but-not-in-binlog transaction gets committed instead, possibly creating inconsistency with a master (e.g. the way it was observed in the bug report). The semisync recovery is now corrected by initializing the binlog coordinates of any transaction in-doubt to the maximum offset, which is unreachable. In effect, when a prepared transaction is not found in binlog, it will be rolled back because it is guaranteed to reside in a truncated tail area of binlog. Mtr tests are reinforced to cover the described scenario.	2022-05-18 09:48:57 +02:00
Sergei Golubchik	443c2a715d	Merge branch '10.7' into 10.8	2022-05-11 12:21:36 +02:00
Sergei Golubchik	3bc98a4ec4	Merge branch '10.5' into 10.6	2022-05-10 14:01:23 +02:00
Sergei Golubchik	ef781162ff	Merge branch '10.4' into 10.5	2022-05-09 22:04:06 +02:00
Sergei Golubchik	a70a1cf3f4	Merge branch '10.3' into 10.4	2022-05-08 23:03:08 +02:00
Sergei Golubchik	531935992a	test fixes for FreeBSD * FreeBSD returns errno 31 (EMLINK, Too many links), not 40 (ELOOP, Too many levels of symbolic links) * (`mysqlbinlog|mysql`) was just crazy, why did it ever work? * socket_ipv6.inc check (that checked whether ipv6 is supported) only worked correctly when ipv6 was supported * perfschema.socket_summary_by_instance was changing global variables and then skipping the test (because of missing ipv6)	2022-05-04 19:34:20 +02:00
Marko Mäkelä	504a3b32f6	Merge 10.8 into 10.9	2022-04-28 15:54:03 +03:00
Marko Mäkelä	133c2129cd	Merge 10.7 into 10.8	2022-04-27 10:43:00 +03:00
Marko Mäkelä	fae0ccad6e	Merge 10.5 into 10.6	2022-04-21 17:46:40 +03:00
Daniel Black	580cbd18b3	Merge branch 10.4 into 10.5 A few of constaint -> constraint	2022-04-21 15:47:03 +10:00
Brandon Nesterenko	c132bce1a1	MDEV-20119: Implement the --do-domain-ids, --ignore-domain-ids, and --ignore-server-ids options for mysqlbinlog New Feature: ============ Extend mariadb-binlog command-line tool to allow for filtering events using GTID domain and server ids. The functionality mimics that of a replica server’s DO_DOMAIN_IDS, IGNORE_DOMAIN_IDS, and IGNORE_SERVER_IDS from CHANGE MASTER TO. For completeness, this patch additionally adds the option --do-server-ids as an alias for --server-id, which now accepts a list of server ids instead of a single one. Example usage: mariadb-binlog --do-domain-ids=2,3,4 --do-server-ids=1,3 master-bin.000001 Functional Notes: 1. --do-domain-ids cannot be combined with --ignore-domain-ids 2. --do-server-ids cannot be combined with --ignore-server-ids 3. A domain id filter can be combined with a server id filter 4. When any new filter options are combined with the --gtid-strict-mode option, events from excluded domains/servers are not validated. 5. Domain/server id filters can be combined with GTID ranges (i.e. specifications of --start-position and --stop-position). However, because the --stop-position option implicitly undertakes filtering to only output events within its range of domains, when combined with --do-domain-ids or --ignore-domain-ids, output will consist of the intersection between the filters. Specifically, with --do-domain-ids and --stop-position, only events with domain ids present in both argument lists will be output. Conversely, with --ignore-domain-ids and --stop-position, only events with domain ids present in the --stop-position and absent from the --ignore-domain-ids options will be output. Reviewed By ============ Andrei Elkin <andrei.elkin@mariadb.com>	2022-04-19 11:09:24 -06:00
Marko Mäkelä	5c69e93630	Merge 10.7 into 10.8	2022-03-30 09:34:07 +03:00
Marko Mäkelä	b242c3141f	Merge 10.5 into 10.6	2022-03-29 16:16:21 +03:00
Marko Mäkelä	d62b0368ca	Merge 10.4 into 10.5	2022-03-29 12:59:18 +03:00
Marko Mäkelä	ae6e214fd8	Merge 10.3 into 10.4	2022-03-29 11:13:18 +03:00
Marko Mäkelä	020e7d89eb	Merge 10.2 into 10.3	2022-03-29 09:53:15 +03:00
Brandon Nesterenko	cd88b0831f	DBAAS-7828: Primary/replica: configuration change of autocommit=0 can not be applied Problem: ======== When the mysql.gtid_slave_pos table uses the InnoDB engine, and mysqld starts, it reads the table and begins a transaction. After reading the value, it should end the transaction and release all associated locks. The bug reported in DBAAS-7828 shows that when autocommit is off, the locks are not released, resulting in indefinite hangs on future attempts to change gtid_slave_pos. In particular, the transaction was not properly finalized because thd->server_status was not updated to reflect the end of the transaction. Solution: ======== This patch updates the code to properly commit the transaction after reading gtid_slave_pos during mysqld start-up. Reviewed By: ============ Andrei Elkin <andrei.elkin@mariadb.com>	2022-03-24 12:00:40 -06:00
Brandon Nesterenko	174f1734a9	MDEV-14608: mysqlbinlog lastest backupfile size is 0 Problem: ======== When using mariadb-binlog with --raw and --stop-never, events from the master's currently active log file should be written to their respective log file specified by --result-file, and shown on-disk. There is a bug where mariadb-binlog does not flush the result file to disk when new events are received. Solution: ======== Add a function call to flush mariadb-binlog’s result file after receiving an event in --raw mode. Reviewed By: ============ Andrei Elkin <andrei.elkin@mariadb.com>	2022-03-24 07:40:29 -06:00
Marko Mäkelä	9f5a3e5689	Merge 10.7 into 10.8	2022-03-15 18:18:07 +02:00
Hugo Wen	dafc5fb9c1	MDEV-27342: Fix issue of recovery failure using new server id Commit `6c39eaeb1` made the crash recovery dependent on server_id. The crash recovery could fail when restoring a new instance from the original crashed data directory USING A NEW SERVER ID. The issue doesn't exist in major versions before 10.6. The root cause is that when generating the input XID to be searched in the hash, the server id is populated with the current server id. So if the server id changed when recovering, the XID couldn't be found in the hash because the server id doesn't match. This fix is to use the original server id when creating the input XID object in function `xarecover_do_commit_or_rollback`. All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.	2022-03-14 19:57:10 -07:00
Marko Mäkelä	1596ef738c	Merge 10.7 into 10.8	2022-03-11 10:49:49 +02:00
Marko Mäkelä	be6f9593fe	Merge 10.5 into 10.6	2022-03-11 09:53:40 +02:00
Marko Mäkelä	81523baac6	Merge 10.4 into 10.5	2022-03-11 09:36:03 +02:00
Marko Mäkelä	22d2df8c6b	Merge 10.3 into 10.4	2022-03-11 09:26:42 +02:00
Vlad Lesin	1766a18e06	MDEV-19577 Replication does not work with innodb_autoinc_lock_mode=2 The first step for deprecating innodb_autoinc_lock_mode(see MDEV-27844) is: - to switch statement binlog format to ROW if binlog format is MIXED and the statement changes autoincremented fields - issue warnings if innodb_autoinc_lock_mode == 2 and binlog format is STATEMENT	2022-03-10 15:38:43 +03:00
Andrei	e7cf871dda	MDEV-24617 OPTIMIZE on a sequence causes unexpected ER_BINLOG_UNSAFE_STATEMENT The warning out of OPTIMIZE Statement is unsafe because it uses a system function was indeed counterfactual and resulted from checking an insufficiently strict property of lex' sql_command_flags. Fixed by deploying an additional check of whether the current sql command that modifies a share->non_determinstic_insert table is capable of generating ROW format events. The extra check rules out the unsafety to OPTIMIZE et al, while the existing check continues to do so to CREATE TABLE (which is peculiarly tagged as ROW-event generative sql command). As a side effect sql_sequence.binlog test gets corrected and binlog_stm_unsafe_warning.test is reinforced to add up an unsafe CREATE..SELECT test.	2022-03-10 13:38:07 +02:00
Marko Mäkelä	50fa94ea2b	Merge 10.7 into 10.8	2022-02-23 16:42:59 +02:00
Marko Mäkelä	164a6aa41c	Merge 10.5 into 10.6	2022-02-23 16:19:45 +02:00
Marko Mäkelä	b91a123d8c	Extend have_sanitizer with ASAN+UBSAN and MSAN Disable some tests that are too slow or big for MSAN.	2022-02-23 15:48:08 +02:00
Oleksandr Byelkin	4fb2cb1a30	Merge branch '10.7' into 10.8	2022-02-04 14:50:25 +01:00
Oleksandr Byelkin	f5c5f8e41e	Merge branch '10.5' into 10.6	2022-02-03 17:01:31 +01:00
Oleksandr Byelkin	cf63eecef4	Merge branch '10.4' into 10.5	2022-02-01 20:33:04 +01:00
Andrei	fe2d90cca9	MDEV-11675. Convert the new session var to bool type and test changes The new @@binlog_alter_two_phase is converted to `my_bool` type.	2022-01-31 22:57:39 +02:00
Brandon Nesterenko	a64991df9d	MDEV-4989: Support for GTID in mysqlbinlog This patch fixes two issues: First, it fixes test failure due to GTID List events having inconsistent ordering of domain ids. In particular, this patch ensures that a GTID list log event will have its GTIDs ordered by domain id (ascending) followed by sequence number (ascending). Second, it fixes an assert which could use an uninitialized variable. Reviewed By: ============ Andrei Elkin <andrei.elkin@mariadb.com>	2022-01-31 09:30:23 -07:00
Oleksandr Byelkin	a576a1cea5	Merge branch '10.3' into 10.4	2022-01-30 09:46:52 +01:00
Oleksandr Byelkin	41a163ac5c	Merge branch '10.2' into 10.3	2022-01-29 15:41:05 +01:00
Sachin	0c5d1342ae	MDEV-11675 Lag Free Alter On Slave This commit implements two phase binloggable ALTER. When a new @@session.binlog_alter_two_phase = YES ALTER query gets logged in two parts, the START ALTER and the COMMIT or ROLLBACK ALTER. START Alter is written in binlog as soon as necessary locks have been acquired for the table. The timing is such that any concurrent DML:s that update the same table are either committed, thus logged into binary log having done work on the old version of the table, or will be queued for execution on its new version. The "COMPLETE" COMMIT or ROLLBACK ALTER are written at the very point of a normal "single-piece" ALTER that is after the most of the query work is done. When its result is positive COMMIT ALTER is written, otherwise ROLLBACK ALTER is written with specific error happened after START ALTER phase. Replication of two-phase binloggable ALTER is cross-version safe. Specifically the OLD slave merely does not recognize the start alter part, still being able to process and memorize its gtid. Two phase logged ALTER is read from binlog by mysqlbinlog to produce BINLOG 'string', where 'string' contains base64 encoded Query_log_event containing either the start part of ALTER, or a completion part. The Query details can be displayed with `-v` flag, similarly to ROW format events. Notice, mysqlbinlog output containing parts of two-phase binloggable ALTER is processable correctly only by binlog_alter_two_phase server. @@log_warnings > 2 can reveal details of binlogging and slave side processing of the ALTER parts. The current commit also carries fixes to the following list of reported bugs: MDEV-27511, MDEV-27471, MDEV-27349, MDEV-27628, MDEV-27528. 
Thanks to all people involved in early discussion of the feature including Kristian Nielsen, those who helped to design, implement and test: Sergei Golubchik, Andrei Elkin who took the burden of the implementation completion, Sujatha Sivakumar, Brandon Nesterenko, Alice Sherepa, Ramesh Sivaraman, Jan Lindstrom.	2022-01-27 21:25:07 +02:00
Andrei	8d9b1aa0d6	MDEV-27536 incremental commit to correct regression test.	2022-01-27 13:44:39 +02:00
Andrei	2ef12cab42	MDEV-27536 invalid BINLOG_BASE64_EVENT and assertion Diagnostics_area:: !is_set() The assert was caused by an error of XA transaction that had BINLOG 'base64_string' statement. The statement failed because of lack of checking whether the encoded replication event was handled by the slave applier thread. If it's not the slave applier no error should be generated, but it was in this case, see a test added. Fixed along with the idea borrowed from upstream to introduce a check of which applier executes the replication event and not report any error if the applier is a regular server client.	2022-01-27 12:28:01 +02:00
Brandon Nesterenko	79e3ee00fa	MDEV-4989: Support for GTID in mysqlbinlog New Feature: =========== This commit extends the mariadb-binlog capabilities to allow events to be filtered by GTID ranges. More specifically, the --start-position and --stop-position arguments have been extended to accept values formatted as a list of GTID positions, e.g. --start-position=0-1-0,1-2-55. The following specific capabilities are addressed: 1) GTIDs can be used to filter results on local binlog files 2) GTIDs can be used to filter results from remote servers 3) Implemented --gtid-strict-mode that ensures the GTID event stream in each domain is monotonically increasing 4) Added new level of verbosity in mysqlbinlog -vvv to print additional diagnostic information/warnings about invalid GTID states 5) For a given GTID range, its start and stop position parameters aim to mimic the behaviors of CHANGE MASTER TO MASTER_USE_GTID=slave_pos and START SLAVE UNTIL master_gtid_pos=<GTID>, respectively. In particular, the start-position list expresses a gtid state of the server, similarly to how @@global.gtid_slave_pos expresses the gtid state of a slave server when connecting to a master with MASTER_USE_GTID=slave_pos. The GTID start-position list is exclusive and the stop-position list is inclusive. This allows users to receive events strictly after those that they already have, and is useful in cases of point in (logical) time recovery including 1) events were received out of order and should be re-sent, or 2) specifying the gtid state of a slave to get events newer than their current state. If a seq_no is 0 for start-position, it means to include the entirety of the domain. If a seq_no is 0 for stop-position, it means to exclude all events from that domain. The GTIDs provided in a start position argument must match with the GTID state of the first processed log (i.e. those listed in the Gtid_list event). 
If a stop position is provided, the events that are output are limited to only those with domain ids listed in the argument. When specifying combinations of start and stop positions, the following behaviors are expected: [--start-position without --stop-position]: Events that have domain ids in the start position are output if their seq_no occurs after the respective start position. Events with domain ids that are unspecified in the start position list are also output. Note that if the Gtid_list event of the first binary log is populated (i.e. non-empty), each domain in the Gtid_list must be present in the start-position list with a seq_no at or after the listed value. This behavior mimics how a slave only processes events after the state provided by @@global.gtid_slave_pos when connecting to a master with CHANGE MASTER TO MASTER_USE_GTID=slave_pos. [--stop-position without --start-position]: Output is limited to only events with both 1) domain ids that are present in the given stop position list and 2) seq_nos that are less than or equal to their respective stop GTID. Once all GTIDs in the stop position list have been processed, the program will stop processing log files. This behavior mimics how START SLAVE UNTIL master_gtid_pos=<G> has a slave only process events with domain ids present in G with their seq_nos at or before the respective gtid. [--start-position and --stop-position]: Output consists of the intersection between the events permitted by both the start and stop position rules. More concretely, the output can be defined by a union of the following rules: 1. For domains which exist in both the start and stop position lists, the events which exist in-between these positions (exclusive start, inclusive stop) are output 2. For all other events, the rules of [--stop-position without --start-position] are followed This is due to the implicit filtering within each individual rule. 
Even though the start position rule always includes events from unspecified domains, the stop position rule takes precedence because it always excludes events from unspecified domains. In other words, events which the start position rule would have included would then always be excluded by the stop position rule. [neither --start-position nor --stop-position]: Events are not omitted based on GTID positioning; however, --gtid-strict-mode and -vvv can still analyze gtid correctness for warning and error reporting. [repeated specification of --start-position or --stop-position]: Subsequent specifications of start and stop positions completely override previous ones. E.g., if invoked as mysqlbinlog --start-position=<G1> --start-position=<G2> ... All GTIDs specified in G1 are ignored and only those specified in G2 are used for the start position. A few additional notes: 1) this commit squashes together the commits: f4319661120e-78a9d49907ba 2) Changed rpl.rpl_blackhole_row_annotate test because it has out of order GTIDs in its binlog, so I added --skip-gtid-strict-mode 3) After all binlog events have been written, the session server id and domain id are reset to their values in the global state Reviewed By: =========== Andrei Elkin: <andrei.elkin@mariadb.com>	2022-01-26 14:17:21 -07:00
Marko Mäkelä	685d958e38	MDEV-14425 Improve the redo log for concurrency The InnoDB redo log used to be formatted in blocks of 512 bytes. The log blocks were encrypted and the checksum was calculated while holding log_sys.mutex, creating a serious scalability bottleneck. We remove the fixed-size redo log block structure altogether and essentially turn every mini-transaction into a log block of its own. This allows encryption and checksum calculations to be performed on local mtr_t::m_log buffers, before acquiring log_sys.mutex. The mutex only protects a memcpy() of the data to the shared log_sys.buf, as well as the padding of the log, in case the to-be-written part of the log would not end in a block boundary of the underlying storage. For now, the "padding" consists of writing a single NUL byte, to allow recovery and mariadb-backup to detect the end of the circular log faster. Like the previous implementation, we will overwrite the last log block over and over again, until it has been completely filled. It would be possible to write only up to the last completed block (if no more recent write was requested), or to write dummy FILE_CHECKPOINT records to fill the incomplete block, by invoking the currently disabled function log_pad(). This would require adjustments to some logic around log checkpoints, page flushing, and shutdown. An upgrade after a crash of any previous version is not supported. Logically empty log files from a previous version will be upgraded. An attempt to start up InnoDB without a valid ib_logfile0 will be refused. Previously, the redo log used to be created automatically if it was missing. Only with innodb_force_recovery=6, it is possible to start InnoDB in read-only mode even if the log file does not exist. This allows the contents of a possibly corrupted database to be dumped. 
Because a prepared backup from an earlier version of mariadb-backup will create a 0-sized log file, we will allow an upgrade from such log files, provided that the FIL_PAGE_FILE_FLUSH_LSN in the system tablespace looks valid. The 512-byte log checkpoint blocks at 0x200 and 0x600 will be replaced with 64-byte log checkpoint blocks at 0x1000 and 0x2000. The start of log records will move from 0x800 to 0x3000. This allows us to use 4096-byte aligned blocks for all I/O in a future revision. We extend the MDEV-12353 redo log record format as follows. (1) Empty mini-transactions or extra NUL bytes will not be allowed. (2) The end-of-minitransaction marker (a NUL byte) will be replaced with a 1-bit sequence number, which will be toggled each time when the circular log file wraps back to the beginning. (3) After the sequence bit, a CRC-32C checksum of all data (excluding the sequence bit) will be written. (4) If the log is encrypted, 8 bytes will be written before the checksum and included in it. This is part of the initialization vector (IV) of encrypted log data. (5) File names, page numbers, and checkpoint information will not be encrypted. Only the payload bytes of page-level log will be encrypted. The tablespace ID and page number will form part of the IV. (6) For padding, arbitrary-length FILE_CHECKPOINT records may be written, with all-zero payload, and with the normal end marker and checksum. The minimum size is 7 bytes, or 7+8 with innodb_encrypt_log=ON. In mariadb-backup and in Galera snapshot transfer (SST) scripts, we will no longer remove ib_logfile0 or create an empty ib_logfile0. Server startup will require a valid log file. When resizing the log, we will create a logically empty ib_logfile101 at the current LSN and use an atomic rename to replace ib_logfile0 with it. See the test innodb.log_file_size. Because there is no mandatory padding in the log file, we are able to create a dummy log file as of an arbitrary log sequence number. 
See the test mariabackup.huge_lsn. The parameter innodb_log_write_ahead_size and the INFORMATION_SCHEMA.INNODB_METRICS counter log_padded will be removed. The minimum value of innodb_log_buffer_size will be increased to 2MiB (because log_sys.buf will replace recv_sys.buf) and the increment adjusted to 4096 bytes (the maximum log block size). The following INFORMATION_SCHEMA.INNODB_METRICS counters will be removed: os_log_fsyncs os_log_pending_fsyncs log_pending_log_flushes log_pending_checkpoint_writes The following status variables will be removed: Innodb_os_log_fsyncs (this is included in Innodb_data_fsyncs) Innodb_os_log_pending_fsyncs (this was limited to at most 1 by design) log_sys.get_block_size(): Return the physical block size of the log file. This is only implemented on Linux and Microsoft Windows for now, and for the power-of-2 block sizes between 64 and 4096 bytes (the minimum and maximum size of a checkpoint block). If the block size is anything else, the traditional 512-byte size will be used via normal file system buffering. If the file system buffers can be bypassed, a message like the following will be issued: InnoDB: File system buffers for log disabled (block size=512 bytes) InnoDB: File system buffers for log disabled (block size=4096 bytes) This has been tested on Linux and Microsoft Windows with both sizes. On Linux, only enable O_DIRECT on the log for innodb_flush_method=O_DSYNC. Tests in 3 different environments where the log is stored in a device with a physical block size of 512 bytes are yielding better throughput without O_DIRECT. This could be due to the fact that in the event the last log block is being overwritten (if multiple transactions would become durable at the same time, and each of them will write a small number of bytes to the last log block), it should be faster to re-copy data from log_sys.buf or log_sys.flush_buf to the kernel buffer, to be finally written at fdatasync() time. 
The parameter innodb_flush_method=O_DSYNC will imply O_DIRECT for data files. This option will enable O_DIRECT on the log file on Linux. It may be unsafe to use when the storage device does not support FUA (Force Unit Access) mode. When the server is compiled WITH_PMEM=ON, we will use memory-mapped I/O for the log file if the log resides on a "mount -o dax" device. We will identify PMEM in a start-up message: InnoDB: log sequence number 0 (memory-mapped); transaction id 3 On Linux, we will also invoke mmap() on any ib_logfile0 that resides in /dev/shm, effectively treating the log file as persistent memory. This should speed up "./mtr --mem" and increase the test coverage of PMEM on non-PMEM hardware. It also allows users to estimate how much the performance would be improved by installing persistent memory. On other tmpfs file systems such as /run, we will not use mmap(). mariadb-backup: Eliminated several variables. We will refer directly to recv_sys and log_sys. backup_wait_for_lsn(): Detect non-progress of xtrabackup_copy_logfile(). In this new log format with arbitrary-sized blocks, we can only detect log file overrun indirectly, by observing that the scanned log sequence number is not advancing. xtrabackup_copy_logfile(): On PMEM, do not modify the sequence bit, because we are not allowed to modify the server's log file, and our memory mapping is read-only. trx_flush_log_if_needed_low(): Do not use the callback on pmem. Using neither flush_lock nor write_lock around PMEM writes seems to yield the best performance. The pmem_persist() calls may still be somewhat slower than the pwrite() and fdatasync() based interface (PMEM mounted without -o dax). recv_sys_t::buf: Remove. We will use log_sys.buf for parsing. recv_sys_t::MTR_SIZE_MAX: Replaces RECV_SCAN_SIZE. recv_sys_t::file_checkpoint: Renamed from mlog_checkpoint_lsn. recv_sys_t, log_sys_t: Removed many data members. recv_sys.lsn: Renamed from recv_sys.recovered_lsn. 
recv_sys.offset: Renamed from recv_sys.recovered_offset. log_sys.buf_size: Replaces srv_log_buffer_size. recv_buf: A smart pointer that wraps log_sys.buf[recv_sys.offset] when the buffer is being allocated from the memory heap. recv_ring: A smart pointer that wraps a circular log_sys.buf[] that is backed by ib_logfile0. The pointer will wrap from recv_sys.len (log_sys.file_size) to log_sys.START_OFFSET. For the record that wraps around, we may copy file name or record payload data to the auxiliary buffer decrypt_buf in order to have a contiguous block of memory. The maximum size of a record is less than innodb_page_size bytes. recv_sys_t::parse(): Take the smart pointer as a template parameter. Do not temporarily add a trailing NUL byte to FILE_ records, because we are not supposed to modify the memory-mapped log file. (It is attached in read-write mode already during recovery.) recv_sys_t::parse_mtr(): Wrapper for recv_sys_t::parse(). recv_sys_t::parse_pmem(): Like parse_mtr(), but if PREMATURE_EOF would be returned on PMEM, use recv_ring to wrap around the buffer to the start. mtr_t::finish_write(), log_close(): Do not enforce log_sys.max_buf_free on PMEM, because it has no meaning on the mmap-based log. log_sys.write_to_buf: Count writes to log_sys.buf. Replaces srv_stats.log_write_requests and export_vars.innodb_log_write_requests. Protected by log_sys.mutex. Updated consistently in log_close(). Previously, mtr_t::commit() conditionally updated the count, which was inconsistent. log_sys.write_to_log: Count swaps of log_sys.buf and log_sys.flush_buf, for writing to log_sys.log (the ib_logfile0). Replaces srv_stats.log_writes and export_vars.innodb_log_writes. Protected by log_sys.mutex. log_sys.waits: Count waits in append_prepare(). Replaces srv_stats.log_waits and export_vars.innodb_log_waits. recv_recover_page(): Do not unnecessarily acquire log_sys.flush_order_mutex. 
We are inserting the blocks in arbitrary order anyway, to be adjusted in recv_sys.apply(true). We will change the definition of flush_lock and write_lock to avoid potential false sharing. Depending on sizeof(log_sys) and CPU_LEVEL1_DCACHE_LINESIZE, the flush_lock and write_lock could share a cache line with each other or with the last data members of log_sys. Thanks to Matthias Leich for providing https://rr-project.org traces for various failures during the development, and to Thirunarayanan Balathandayuthapani for his help in debugging some of the recovery code. And thanks to the developers of the rr debugger for a tool without which extensive changes to InnoDB would be very challenging to get right. Thanks to Vladislav Vaintroub for useful feedback and to him, Axel Schwenke and Krunal Bauskar for testing the performance.	2022-01-21 16:03:47 +02:00
Marko Mäkelä	3f5726768f	Merge 10.5 into 10.6	2022-01-04 09:26:38 +02:00
Andrei	80da35a326	MDEV-27365 CREATE-or-REPLACE SEQUENCE is binlogged without DDL flag CREATE-OR-REPLACE SEQUENCE is not logged with Gtid event DDL flag which affects its slave parallel execution. Unlike other DDL:s it can occur in concurrent execution with following transactions which can lead to various errors, including asserts like (mdl_request->type != MDL_INTENTION_EXCLUSIVE && mdl_request->type != MDL_EXCLUSIVE) || !(get_thd()->rgi_slave && get_thd()->rgi_slave->is_parallel_exec && lock->check_if_conflicting_replication_locks(this) in MDL_context::acquire_lock. Fixed to wrap internal statement level commit with save- and-restore of TRANS_THD::m_unsafe_rollback_flags.	2022-01-03 17:39:23 +02:00
Rucha Deodhar	fad1d15326	MDEV-25460: Assertion `!is_set() || (m_status == DA_OK_BULK && is_bulk_op())' failed in Diagnostics_area::set_ok_status in my_ok from mysql_sql_stmt_prepare Analysis: Before PREPARE is executed, binlog_format is STATEMENT. This PREPARE had SET STATEMENT which sets binlog_format to ROW. Now after PREPARE is done we reset the binlog_format (back to STATEMENT). But we have temporary table, it doesn't let changing binlog_format=ROW to binlog_format=STATEMENT and gives error which goes unreported. This unreported error eventually causes assertion failure. Fix: Change return type for LEX::restore_set_statement_var() to bool and make it return error state.	2021-12-28 16:59:29 +05:30
Julius Goryavsky	55bb933a88	Merge branch 10.4 into 10.5	2021-12-26 12:51:04 +01:00
Julius Goryavsky	681b7784b6	Merge branch 10.3 into 10.4	2021-12-25 12:13:03 +01:00
Julius Goryavsky	3376668ca8	Merge branch 10.2 into 10.3	2021-12-23 14:14:04 +01:00
forkfun	eafa2a1411	enable partition_open_files_limit test	2021-12-09 16:29:22 +01:00
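
The GTID range rules described in commit 79e3ee00fa (exclusive start, inclusive stop, and a stop list that limits output to its own domains) can be illustrated with a small model. This is only a sketch of the filtering semantics as worded in that commit message, not the mariadb-binlog implementation: the names `parse_gtid_list` and `passes_gtid_range` are hypothetical, and the sketch leaves out the Gtid_list validation of the first log, --gtid-strict-mode, and stopping once every stop GTID has been processed.

```python
from typing import Dict, Optional, Tuple

# A GTID is modelled as (domain_id, server_id, seq_no), e.g. "1-2-55".
Gtid = Tuple[int, int, int]


def parse_gtid_list(text: str) -> Dict[int, int]:
    """Hypothetical helper: map domain_id -> seq_no for a list like "0-1-0,1-2-55"."""
    positions: Dict[int, int] = {}
    for item in text.split(","):
        domain_id, _server_id, seq_no = (int(part) for part in item.strip().split("-"))
        positions[domain_id] = seq_no
    return positions


def passes_gtid_range(
    event: Gtid,
    start: Optional[Dict[int, int]] = None,
    stop: Optional[Dict[int, int]] = None,
) -> bool:
    """Hypothetical model of the start/stop rules from the commit message."""
    domain_id, _server_id, seq_no = event

    if stop is not None:
        # A stop list limits output to the domains it names.
        if domain_id not in stop:
            return False
        # The stop position is inclusive; a stop seq_no of 0 excludes the
        # whole domain, because real events have seq_no >= 1.
        if seq_no > stop[domain_id]:
            return False

    if start is not None and domain_id in start:
        # The start position is exclusive; a start seq_no of 0 therefore
        # includes the entire domain.
        if seq_no <= start[domain_id]:
            return False

    # Domains missing from the start list pass the start rule.
    return True


if __name__ == "__main__":
    start = parse_gtid_list("0-1-0,1-2-55")
    stop = parse_gtid_list("1-2-60")
    for event in [(1, 2, 55), (1, 2, 56), (1, 2, 61), (0, 1, 1)]:
        print(event, passes_gtid_range(event, start=start, stop=stop))
```

Keeping the two rules as independent checks mirrors the commit's description of the combined case as the intersection of what each rule permits: an event from a domain absent from the stop list is rejected even though the start rule alone would have let it through.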


1556 commits