(Polished initial patch by Alexey Botchkov)
Make the code handle DEFAULT values of any datatype
- Make Json_table_column::On_response::m_default be Item*, not LEX_STRING.
- Change the parser to use string literal non-terminals for producing
the DEFAULT value
-- Also, stop updating json_table->m_text_literal_cs for the DEFAULT
value literals as it is not used.
A server that was running with innodb_log_file_size=96M and
innodb_buffer_pool_size=6M had inserted some data into a table
that was subsequently dropped. When the server was killed and
restarted, an assertion failed in recv_sys_t::parse() while
a FSP_SIZE change was unnecessarily being processed during
the skip_the_rest: loop in recv_scan_log().
The ib_logfile0 contents were as follows:
1. The checkpoint start LSN points to the start of some mini-transaction.
2. There may be log records for modifying files for which a FILE_MODIFY
had been written before the checkpoint. These records were "purged"
by advancing the checkpoint.
3. At some point during the initial parsing with store=true the space
reserved for recv_sys.pages will run out and recv_scan_log() would switch
to the skip_the_rest: mode.
4. We encounter a log record for extending a tablespace that will be
deleted a bit later. This would trip the bogus debug assertion.
5. Later on, there would be a FILE_DELETE record for this tablespace.
6. The checkpoint end LSN points to a possibly empty sequence of
FILE_MODIFY records and a FILE_CHECKPOINT record. Recovery had parsed these
records first, before rewinding to the checkpoint start LSN.
7. There could be further records following the FILE_CHECKPOINT record.
Recovery will process all records until an inconsistency is found and
it is assumed that the end of the circular ib_logfile0 was reached.
recv_sys_t::parse(): For the template instantiation with store=false,
remove a debug assertion that could fail in a multi-batch recovery,
while recv_scan_log(false) would be in the skip_the_rest: loop.
It is very well possible that we have not encountered all FILE_ records
yet, and therefore we should not complain about unknown tablespaces.
Reviewed by: Debarun Banerjee
When using the default innodb_log_buffer_size=2m, mariadb-backup --backup
would spend a lot of time re-reading and re-parsing the log. For reads,
it would be beneficial to memory-map the entire ib_logfile0 to the
address space (typically 48 bits or 256 TiB) and read it from there,
both during --backup and --prepare.
We will introduce the Boolean read-only parameter innodb_log_file_mmap
that will be OFF by default on most platforms, to avoid aggressive
read-ahead of the entire ib_logfile0 when only a tiny portion would be
accessed. On Linux and FreeBSD the default is innodb_log_file_mmap=ON,
because those platforms define a specific mmap(2) option for enabling
such read-ahead and therefore it can be assumed that the default would
be on-demand paging. This parameter will only have impact on the initial
InnoDB startup and recovery. Any writes to the log will use regular I/O,
except when the ib_logfile0 is stored in a specially configured file system
that is backed by persistent memory (Linux "mount -o dax").
We also experimented with allowing writes of the ib_logfile0 via a
memory mapping and decided against it. A fundamental problem would be
unnecessary read-before-write in case of a major page fault, that is,
when a new, not yet cached, virtual memory page in the circular
ib_logfile0 is being written to. There appears to be no way to tell
the operating system that we do not care about the previous contents of
the page, or that the page fault handler should just zero it out.
Many references to HAVE_PMEM have been replaced with references to
HAVE_INNODB_MMAP.
The predicate log_sys.is_pmem() has been replaced with
log_sys.is_mmap() && !log_sys.is_opened().
Memory-mapped regular files differ from MAP_SYNC (PMEM) mappings in the
way that an open file handle to ib_logfile0 will be retained. In both
code paths, log_sys.is_mmap() will hold. Holding a file handle open will
allow log_t::clear_mmap() to disable the interface with fewer operations.
It should be noted that ever since
commit 685d958e38 (MDEV-14425)
most 64-bit Linux platforms in our CI environment
(s390x a.k.a. IBM System Z being a notable exception) read and write
/dev/shm/*/ib_logfile0 via a memory mapping, pretending that it is
persistent memory (mount -o dax). So, the memory mapping based log
parsing that this change is enabling by default on Linux and FreeBSD
has already been extensively tested on Linux.
::log_mmap(): If a log cannot be opened as PMEM and the desired access
is read-only, try to open a read-only memory mapping.
xtrabackup_copy_mmap_snippet(), xtrabackup_copy_mmap_logfile():
Copy the InnoDB log in mariadb-backup --backup from a memory
mapped file.
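
As a rough illustration of the read-only mapping idea described above, the following is a minimal POSIX sketch. The names ReadOnlyMapping, map_file_read_only(), unmap_file() and demo_roundtrip() are hypothetical and are not taken from the InnoDB or mariadb-backup sources. The sketch passes no read-ahead related flag to mmap(2), on the assumption that the platform default is on-demand paging, and it keeps the file descriptor open as the text describes for regular files.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper type; not part of the InnoDB source.
struct ReadOnlyMapping {
  const unsigned char *data = nullptr;  // nullptr when the mapping failed
  std::size_t size = 0;
  int fd = -1;                          // kept open while the mapping exists
};

// Map an entire file read-only. On failure, data stays nullptr.
ReadOnlyMapping map_file_read_only(const char *path) {
  assert(path != nullptr);
  ReadOnlyMapping m;
  m.fd = open(path, O_RDONLY);
  if (m.fd < 0) return m;
  struct stat st;
  if (fstat(m.fd, &st) != 0 || st.st_size <= 0) {
    close(m.fd);
    m.fd = -1;
    return m;
  }
  const std::size_t size = static_cast<std::size_t>(st.st_size);
  void *p = mmap(nullptr, size, PROT_READ, MAP_SHARED, m.fd, 0);
  if (p == MAP_FAILED) {
    close(m.fd);
    m.fd = -1;
    return m;
  }
  m.data = static_cast<const unsigned char *>(p);
  m.size = size;
  return m;
}

// Release the mapping and the file descriptor, then reset the struct.
void unmap_file(ReadOnlyMapping &m) {
  if (m.data) munmap(const_cast<unsigned char *>(m.data), m.size);
  if (m.fd >= 0) close(m.fd);
  m = ReadOnlyMapping();
}

// Self-check: write a temporary file, map it, and compare the contents.
bool demo_roundtrip() {
  char path[] = "/tmp/mmap_sketch_XXXXXX";
  int fd = mkstemp(path);
  if (fd < 0) return false;
  const char payload[] = "redo";
  bool ok = write(fd, payload, 4) == 4;
  close(fd);
  ReadOnlyMapping m = map_file_read_only(path);
  ok = ok && m.data != nullptr && m.size == 4 &&
       std::memcmp(m.data, payload, 4) == 0;
  unmap_file(m);
  unlink(path);
  return ok;
}
```

A reader of the log would then parse records directly from data[0..size) instead of copying them through a fixed-size buffer.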
SSL_CTX_set_ciphersuites() sets the TLSv1.3 cipher suites.
SSL_CTX_set_cipher_list() sets the ciphers for TLSv1.2 and below.
The current TLS configuration logic will not perform SSL_CTX_set_cipher_list()
to configure TLSv1.2 ciphers if the call to SSL_CTX_set_ciphersuites() was
successful. The call to SSL_CTX_set_ciphersuites() is successful if any TLSv1.3
cipher suite is passed into `--ssl-cipher`.
This is a potential security vulnerability because users trying to restrict
the server to specific secure ciphers for both TLSv1.3 and TLSv1.2 would
unknowingly still have the database support insecure TLSv1.2 ciphers.
For example:
If setting `--ssl-cipher=TLS_AES_128_GCM_SHA256:ECDHE-RSA-AES128-GCM-SHA256`,
the database would still support all possible TLSv1.2 ciphers rather than only
ECDHE-RSA-AES128-GCM-SHA256.
The solution is to execute both SSL_CTX_set_ciphersuites() and
SSL_CTX_set_cipher_list() even if the first call succeeds.
This allows the configuration of exactly which TLSv1.3 and TLSv1.2 ciphers to
support.
Note that there is one behavior change with this. When specifying only TLSv1.3
ciphers to `--ssl-cipher`, the database will not support any TLSv1.2 cipher.
However, this does not pose a security risk, and considering that TLSv1.3 is
the modern protocol, this behavior should be fine.
All TLSv1.3 ciphers are still supported if only TLSv1.2 ciphers are specified
through `--ssl-cipher`.
All new code of the whole pull request, including one or several files that are
either new files or modified ones, are contributed under the BSD-new license. I
am contributing on behalf of my employer Amazon Web Services, Inc.
wsrep_ready is read for every command execution, and during slave replication
for every applied event.
It is also planned to be used during write set applying, which means that
almost every server thread would compete for the mutex covering
this variable, even though it rarely changes.
Converting wsrep_ready to an atomic removes this contention.
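
The general pattern can be sketched as follows. This is only an illustration with a hypothetical ReadyFlag class, not the actual wsrep code; the memory orderings shown are an assumption about what a flag that guards other state would typically need.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a rarely changing, frequently read status flag.
class ReadyFlag {
  std::atomic<bool> ready_{false};

public:
  // Hot path: a single atomic load, no mutex acquisition.
  bool is_ready() const noexcept {
    return ready_.load(std::memory_order_acquire);
  }

  // Cold path: store the new value and return the previous one,
  // so that the caller can tell whether the state actually changed.
  bool set(bool value) noexcept {
    return ready_.exchange(value, std::memory_order_acq_rel);
  }
};
```

Readers that only need the current value never block each other, which is the point when nearly every thread reads the flag and almost nobody writes it.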
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Add wait_until_ready waits after wsrep_on is set on again to
make sure that the node is ready for the next step before continuing.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Stabilize the test by resetting DEBUG_SYNC and adding a wait_condition
for the expected table contents.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
The crash report terminates prematurely when the Galera library was
not loaded.
As a fix, check whether the provider is loaded before shutting down
Galera connections.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Move memory allocations performed during Sys_var_gtid_binlog_state::do_check
to Sys_var_gtid_binlog_state::global_update where they will be freed before
the latter method returns.
A call to
dbug_print_join_prefix(join_positions, idx, s)
returns a const char* pointer to a string with the current join prefix,
including the table being added to it.
All the options that were in buildbot should
be in the server, making them accessible to all
without any special invocation.
If WITH_MSAN=ON, we want to make sure that the
compiler options are supported, and the build will fail
with an error if they are not.
WITH_MSAN=ON appends -stdlib=libc++
to the CXX_FLAGS if supported.
With the SECURITY_HARDENING options the bootstrap
currently crashes, so for now we disable SECURITY_HARDENING
if MSAN is enabled.
Option WITH_DBUG_TRACE has no effect in MSAN builds.
Each time a listener socket becomes ready, MariaDB calls accept() ten
times (MAX_ACCEPT_RETRY), even if all but the first one return EAGAIN
because there are no more connections. This causes unnecessary CPU
usage: on our server, the thread that does nothing but accept()
consumes ~45% of one CPU core. The loop should stop
after the first EAGAIN.
Perf report:
11.01% mariadbd libc.so.6 [.] accept4
6.42% mariadbd [kernel.kallsyms] [k] finish_task_switch.isra.0
5.50% mariadbd [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
5.50% mariadbd [kernel.kallsyms] [k] syscall_enter_from_user_mode
4.59% mariadbd [kernel.kallsyms] [k] __fget_light
3.67% mariadbd [kernel.kallsyms] [k] kmem_cache_alloc
2.75% mariadbd [kernel.kallsyms] [k] fput
2.75% mariadbd [kernel.kallsyms] [k] mod_objcg_state
1.83% mariadbd [kernel.kallsyms] [k] __inode_wait_for_writeback
1.83% mariadbd [kernel.kallsyms] [k] __sys_accept4
1.83% mariadbd [kernel.kallsyms] [k] _raw_spin_unlock_irq
1.83% mariadbd [kernel.kallsyms] [k] alloc_inode
1.83% mariadbd [kernel.kallsyms] [k] call_rcu
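
The intended loop shape can be sketched as below. This is not the server's actual code: drain_listener(), accept_once and handle are hypothetical names, and the accept() call is abstracted behind a callable. Retrying on EINTR and stopping on every other error are assumptions of this sketch; the text above only states that the loop should stop at the first EAGAIN.

```cpp
#include <cassert>
#include <cerrno>

constexpr int MAX_ACCEPT_RETRY = 10;  // value quoted in the text above

// accept_once: hypothetical callable wrapping accept() on a non-blocking
//              listener; returns a descriptor >= 0, or -1 with errno set.
// handle:      hypothetical callable that takes ownership of the descriptor.
// Returns the number of accept attempts made for this readiness event.
template <typename AcceptFn, typename HandleFn>
int drain_listener(AcceptFn accept_once, HandleFn handle) {
  int calls = 0;
  for (int i = 0; i < MAX_ACCEPT_RETRY; i++) {
    calls++;
    const int fd = accept_once();
    if (fd >= 0) {
      handle(fd);
      continue;  // there may be more pending connections
    }
    if (errno == EINTR)
      continue;  // interrupted by a signal: retry (assumption of this sketch)
    break;       // EAGAIN/EWOULDBLOCK: queue is empty; other errors: give up
  }
  return calls;
}
```

With two pending connections this makes three accept attempts (two successes and one EAGAIN) instead of ten.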
Applied SR transaction on the child table was not BF aborted by TOI running
on the parent table for several reasons:
Although SR correctly collected FK-referenced keys to parent, TOI in Galera
disregards common certification index and simply sets itself to depend on
the latest certified write set seqno.
Since this write set was the fragment of SR transaction, TOI was allowed to
run in parallel with SR presuming it would BF abort the latter.
At the same time, DML transactions in the server don't grab MDL locks on
FK-referenced tables, thus parent table wasn't protected by an MDL lock from
SR and it couldn't provoke MDL lock conflict for TOI to BF abort SR transaction.
In InnoDB, DDL transactions grab shared MDL locks on child tables, which is not
enough to trigger MDL conflict in Galera.
InnoDB-level Wsrep patch didn't contain correct conflict resolution logic due to
the fact that it was believed MDL locking should always produce conflicts correctly.
The fix brings conflict resolution rules similar to MDL-level checks to InnoDB,
thus accounting for the problematic case.
Apart from that, wsrep_thd_is_SR() is patched to return true only for executing
SR transactions. It should be safe as any other SR state is either the same as
for any single write set (thus making the two logically equivalent), or it reflects
an SR transaction as aborting or prepared, which is handled separately in the
BF-aborting logic, and for the regular execution path it should not matter at all.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Starting with GCC 7 and clang 15, single-bit operations such as
fetch_or(1) & 1 are translated into 80386 instructions such as
LOCK BTS, instead of using the generic translation pattern
of emitting a loop around LOCK CMPXCHG.
Given that the oldest currently supported GNU/Linux distributions
ship GCC 7, and that older versions of GCC are out of support,
let us remove some work-arounds that are not strictly necessary.
If someone compiles the code using an older compiler, it will work
but possibly less efficiently.
srw_mutex_impl::HOLDER: Changed from 1U<<31 to 1 in order to
work around https://github.com/llvm/llvm-project/issues/37322
which is specific to setting the most significant bit.
srw_mutex_impl::WAITER: A multiplier of waiting requests.
This used to be 1, which would now collide with HOLDER.
fil_space_t::set_stopping(): Remove this unused function.
In MSVC we need _interlockedbittestandset() for LOCK BTS.
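
To illustrate the single-bit pattern, here is a deliberately simplified lock word. LockWord is a hypothetical class and not the real srw_mutex_impl: it has no wait or wake logic and only shows how HOLDER can occupy the least significant bit while WAITER acts as a multiplier for the waiter count.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Simplified, hypothetical lock word; not the real srw_mutex_impl.
class LockWord {
  std::atomic<uint32_t> word_{0};

public:
  static constexpr uint32_t HOLDER = 1;  // least significant bit: lock is held
  static constexpr uint32_t WAITER = 2;  // each waiting request adds this much

  // A fetch_or of one bit whose result is masked with the same bit.
  // This is the pattern that recent compilers can turn into LOCK BTS on x86.
  bool try_lock() noexcept {
    return !(word_.fetch_or(HOLDER, std::memory_order_acquire) & HOLDER);
  }

  // Clear only the HOLDER bit; the waiter count is left untouched.
  void unlock() noexcept {
    word_.fetch_and(~HOLDER, std::memory_order_release);
  }

  void add_waiter() noexcept {
    word_.fetch_add(WAITER, std::memory_order_relaxed);
  }

  void remove_waiter() noexcept {
    word_.fetch_sub(WAITER, std::memory_order_relaxed);
  }

  bool is_locked() const noexcept {
    return word_.load(std::memory_order_relaxed) & HOLDER;
  }

  uint32_t waiters() const noexcept {
    return (word_.load(std::memory_order_relaxed) & ~HOLDER) / WAITER;
  }
};
```

Because WAITER is 2, adding or removing waiters never touches bit 0, so the holder flag and the waiter count can share one atomic word.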
The issue is caused by a race between buf_page_create_low getting the
page from buffer pool hash and buf_LRU_free_page evicting it from LRU.
The issue was introduced in 10.6 by MDEV-27058
commit aaef2e1d8c
MDEV-27058: Reduce the size of buf_block_t and buf_page_t
The solution is to buffer-fix the page before releasing the buffer pool mutex
in buf_page_create_low when x_lock_try fails to acquire the page latch.
log_t::persist(): Add a parameter holding_latch to specify
whether the caller is already holding exclusive log_sys.latch,
like log_write_and_flush() always is.
The code erroneously called sec_since_epoch() for dates with zeros,
e.g. '2024-00-01'.
Fix: add a check that the date does not have zeros before
calling TIME_to_native().
The code in my_strtoll10_mb2 and my_strtoll10_utf32
could hit undefined behavior by negating LONGLONG_MIN.
Fixing the code to avoid this.
Also, fixing my_strtoll10() in the same style.
The previous reduction produced a redundant warning on
CAST(_latin1'-9223372036854775808' AS SIGNED)
The code in my_strntoull_8bit() and my_strntoull_mb2_or_mb4()
could hit undefined behavior by negating LONGLONG_MIN.
Fixing the code to avoid this.
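
One common way to avoid this kind of undefined behavior is to keep the parsed digits as an unsigned magnitude and to special-case the magnitude of the minimum value. The sketch below shows that approach with a hypothetical helper apply_sign(); it is not the code used in the my_strtoll10 or my_strntoull families, and saturating on overflow is an assumption of this example.

```cpp
#include <cassert>
#include <climits>

// Convert a parsed magnitude plus a sign into a signed 64-bit value without
// ever evaluating -LLONG_MIN. Out-of-range input saturates and sets *overflow.
long long apply_sign(unsigned long long magnitude, bool negative,
                     bool *overflow) {
  assert(overflow != nullptr);
  // |LLONG_MIN| is not representable as long long, but it is as unsigned.
  const unsigned long long min_magnitude =
      static_cast<unsigned long long>(LLONG_MAX) + 1ULL;
  *overflow = false;
  if (!negative) {
    if (magnitude > static_cast<unsigned long long>(LLONG_MAX)) {
      *overflow = true;
      return LLONG_MAX;
    }
    return static_cast<long long>(magnitude);
  }
  if (magnitude >= min_magnitude) {
    // Exactly |LLONG_MIN| is a valid value; anything larger is an overflow.
    *overflow = magnitude > min_magnitude;
    return LLONG_MIN;
  }
  // Here magnitude <= LLONG_MAX, so the negation is well defined.
  return -static_cast<long long>(magnitude);
}
```

With this shape, the text '-9223372036854775808' maps to the minimum value without reporting an overflow.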
This commit adds support for legacy names for files such
as mariadb_backup_galera_info, mariadb_backup_checkpoints
and mariabackup_binlog_info to allow upgrading from old
to new server versions without stopping the galera cluster.
Updated tests: cases with bugs or which cannot be run
with the cursor-protocol were excluded with
"--disable_cursor_protocol"/"--enable_cursor_protocol"
Fix for v.10.5
Added the ability to disable/enable (--disable_cursor_protocol/
--enable_cursor_protocol) the cursor-protocol in tests. If
"--disable_cursor_protocol" is used then the ps-protocol is also
disabled. With the cursor-protocol a prepared statement is executed
only once. For "--cursor-protocol" a filter for queries was added:
it is used only for "SELECT" queries.
The loose regex for the MDEV-34539 test ended up
matching the opensuse in the path in buildbot.
Adjust to a more complete regex including space,
backtick and \n, which are much less common
in a path name.
The failing test case validates Seconds_Behind_Master for a delayed
slave, while STOP SLAVE is executed during a delay. The test fixes
initially added to the test (commit b04c857596) added a table lock
to ensure a transaction could not finish before validating the
Seconds_Behind_Master field after START SLAVE, but did not address a
possibility that the transaction could finish before running the
STOP SLAVE command, which invalidates the validations for the rest
of the test case. Specifically, this would result in 1) a timeout in
“Waiting for table metadata lock” on the replica, which expects the
transaction to retry after slave restart and hit a lock conflict on
the locked tables (added in b04c857596), and 2) that
Seconds_Behind_Master should have increased, but did not.
The failure can be reproduced by synchronizing the slave to the master
before the MDEV-32265 echo statement (i.e. before the STOP SLAVE).
This patch fixes the test by adding a mechanism to use DEBUG_SYNC to
synchronize a MASTER_DELAY, rather than continually increase the
duration of the delay each time the test fails on buildbot. This is
to ensure that on slow machines, a delay does not pass before the
test gets a chance to validate results. Additionally, it decreases
overall test time because the test can continue immediately after
validation, thereby bypassing the remainder of a full delay for each
transaction.
A CHAR column cannot be longer than 1024 octets, because
Binlog_type_info_fixed_string::Binlog_type_info_fixed_string
relies on this fact - it cannot store binlog metadata for longer columns.
In case of the filename character set mbmaxlen is equal to 5,
so only 1024/5=204 characters can fit into the 1024 limit.
- In strict mode:
Disallowing creation of a CHAR column with octet length greater than 1024.
- In non-strict mode:
Automatically convert CHAR with octet length>1024 into VARCHAR.
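
The rule above can be sketched as a small decision function. The names CharDecision and decide_char_column() are hypothetical and do not come from the server source; the sketch only encodes the arithmetic (octet length = character length * mbmaxlen) and the two modes described above.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t MAX_CHAR_OCTETS = 1024;  // limit stated in the text above

enum class CharDecision { KeepChar, ConvertToVarchar, Reject };

// Decide what to do with CHAR(char_length) in a character set whose widest
// character takes mbmaxlen octets. Multiplication overflow is ignored here.
CharDecision decide_char_column(std::size_t char_length, std::size_t mbmaxlen,
                                bool strict_mode) {
  assert(mbmaxlen > 0);
  const std::size_t octets = char_length * mbmaxlen;
  if (octets <= MAX_CHAR_OCTETS)
    return CharDecision::KeepChar;
  return strict_mode ? CharDecision::Reject : CharDecision::ConvertToVarchar;
}
```

For the filename character set (mbmaxlen=5), CHAR(204) needs 1020 octets and is kept, while CHAR(205) needs 1025 octets and is either rejected or converted depending on the mode.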