prepare_inplace_alter_table_dict(): If the table will not be rebuilt,
preserve all of the original ROW_FORMAT, including the compressed
page size flags related to ROW_FORMAT=COMPRESSED.
btr_root_raise_and_insert(), btr_lift_page_up(),
rtr_page_split_and_insert(): Reset DB_FAIL from a failure to
copy records on a ROW_FORMAT=COMPRESSED page to DB_SUCCESS
before retrying.
This fixes a regression that was introduced by
commit 0b47c126e3 (MDEV-13542).
btr_root_raise_and_insert(): Remove a redundant condition.
btr_page_split_and_insert() will invoke btr_page_split_and_insert()
if needed.
Now INSERT, UPDATE, ALTER statements involving incompatible data type pairs, e.g.:
UPDATE TABLE t1 SET col_inet6=col_int;
INSERT INTO t1 (col_inet6) SELECT col_in FROM t2;
ALTER TABLE t1 MODIFY col_inet6 INT;
consistently return an error at the statement preparation time:
ERROR HY000: Illegal parameter data types inet6 and int for operation 'SET'
and abort the statement before starting interating rows.
This error is the same with what is raised for queries like:
SELECT col_inet6 FROM t1 UNION SELECT col_int FROM t2;
SELECT COALESCE(col_inet6, col_int) FROM t1;
Before this change the error was caught only during the execution time,
when a Field_xxx::store_xxx() was called for the very firts row.
The behavior was not consistent between various statements and could do different things:
- abort the statement
- set a column to the data type default value (e.g. '::' for INET6)
- set a column to NULL
A typical old error was:
ERROR 22007: Incorrect inet6 value: '1' for column `test`.`t1`.`a` at row 1
EXCEPTION:
Note, there is an exception: a multi-row INSERT..VALUES, e.g.:
INSERT INTO t1 (col_a,col_b) VALUES (a1,b1),(a2,b2);
checks assignment compability at the preparation time for the very first row only:
(col_a,col_b) vs (a1,b1)
Other rows are still checked at the execution time and return the old warnings
or errors in case of a failure. This is done because catching all rows at the
preparation time would change behavior significantly. So it still works
according to the STRICT_XXX_TABLES sql_mode flags and the table transaction ability.
This is too late to change this behavior in 10.7.
There is no a firm decision yet if a multi-row INSERT..VALUES
behavior will change in later versions.
On FreeBSD, tests run on persistent storage, and no asynchronous I/O
has been implemented. Warnings about 205-second waits on dict_sys.latch
may occur.
buf_page_print(): Dump the buffer page 32 bytes (64 hexadecimal digits)
per line. In this way, the limitation in mtr
("Data too long for column 'line'") will not be triggered.
Also, do not bother decoding the page contents, because everything
is present in the hexadecimal output.
dict_index_find_on_id_low(): Merge to dict_index_get_if_in_cache_low().
The direct call in buf_page_print() was prone to crashing, in case the
table definition was concurrently evicted or dropped from the
data dictionary cache.
Spider supports (or at least allows) INSERT DELAYED but the
documentation does not specify spider as a storage engine that supports
"INSERT DELAYED".
Also, although not mentioned in the documentation, "INSERT DELAYED" is
not intended to be executed inside a transaction, as can be seen from
the list of supported storage engines.
The current implementation allows executing a delayed insert on a
remote transactional table and this breaks the consistency ensured by
the transaction.
We too remove "internal_delayed", one of the Spider table parameters.
Documentation says,
> Whether to transmit existence of delay to remote servers when
> executing an INSERT DELAYED statement on local server.
This table parameter is only used for "INSERT DELAYED".
Reviewed by: Nayuta Yanagisawa
When the variable, spider_semi_table_lock, is 1, Spider sends
LOCK TABLES before each SQL execution. The feature is for
non-transactional remote tables and adds some overhead to query
executions.
We change the default value of the plugin variable to 0 and then
deprecate the variable because it is rare to use non-transactional
engines these days and the variable complicates the code.
The variable, spider_semi_table_lock_connection, should be too
deprecated because it is for tweaking the semi-table locking.
Deprecate the high availability feature of Spider as the feature is
beyond the scope of the federated storage engine and it complicates
the implementation of Spider.
Take into account that in preparation of a simple key cache for resizing no disk blocks might be assigned to it.
Reviewer: IgorBabaev <igor@mariadb.com>
Revert "TSAN: data race on vptr (ctor/dtor vs virtual call)"
This reverts commit 78084fa747.
This commit was done to please TSAN, which falsely reported an error
where there was none.
Yet as consequence, it could cause a real error, a crash in os_aio_free on
shutdown
fil_name_process(): If the recovery of a tablespace was deferred,
do invoke fil_ibd_load() even though the name in recv_spaces is
not changing. This allows us to recover from a situation where
there are many FILE_RENAME records, renaming a tablespace back
and forth, and a FILE_MODIFY record that had been written by
fil_names_clear().
Co-developed with: Thirunarayanan Balathandayuthapani
Fixed tpool timer implementation on POSIX.
Prior to this patch, under some specific rare circumstances (concurrency
related), timer callback execution might be skipped.
When attempting to recover a database with an incorrect encryption key,
the unencrypted page contents should be expected to differ from what
was written before recovery. Let us suppress some more messages.
This caused intermittent failures, depending on when the latest
log checkpoint was triggered.
Histogram_json_hb::range_selectivity() may return small negative
numbers due to rounding errors in the histogram.
Make sure the returned value is non-negative.
Add an assert to catch negative values that are not small.
(attempt #2)
trx_undo_rec_copy(): Return nullptr if the undo record is corrupted.
trx_undo_rec_get_undo_no(): Define inline with the declaration.
trx_purge_dummy_rec: Replaced with a -1 pointer.
row_undo_rec_get(), UndorecApplier::apply_undo_rec(): Check
if trx_undo_rec_copy() returned nullptr.
trx_purge_get_next_rec(): Return nullptr upon encountering any
corruption, to signal the end of purge.
On GNU/Linux, even though the C11 aligned_alloc() appeared in
GNU libc early on, some custom memory allocators did not
implement it until recently. For example, before
gperftools/gperftools@d406f22853
the free() in tcmalloc would fail to free memory that was
returned by aligned_alloc(), because the latter would map to the
built-in allocator of libc. The Linux specific memalign() has a
similar interface and is safer to use, because it has been
available for a longer time. For AddressSanitizer, we will use
aligned_alloc() so that the constraint on size can be enforced.
buf_tmp_reserve_compression_buf(): When HAVE_ALIGNED_ALLOC holds,
round up the size to be an integer multiple of the alignment.
pfs_malloc(): In the unit test stub, round up the size to be an
integer multiple of the alignment.
Table_cache_instance: Define the structure aligned at
the CPU cache line, and remove a pad[] data member.
Krunal Bauskar reported this to improve performance on ARMv8.
aligned_malloc(): Wrapper for the Microsoft _aligned_malloc()
and the ISO/IEC 9899:2011 <stdlib.h> aligned_alloc().
Note: The parameters are in the Microsoft order (size, alignment),
opposite of aligned_alloc(alignment, size).
Note: The standard defines that size must be an integer multiple
of alignment. It is enforced by AddressSanitizer but not by GNU libc
on Linux.
aligned_free(): Wrapper for the Microsoft _aligned_free() and
the standard free().
HAVE_ALIGNED_ALLOC: A new test. Unfortunately, support for
aligned_alloc() may still be missing on some platforms.
We will fall back to posix_memalign() for those cases.
HAVE_MEMALIGN: Remove, along with any use of the nonstandard memalign().
PFS_ALIGNEMENT (sic): Removed; we will use CPU_LEVEL1_DCACHE_LINESIZE.
PFS_ALIGNED: Defined using the C++11 keyword alignas.
buf_pool_t::page_hash_table::create(),
lock_sys_t::hash_table::create():
lock_sys_t::hash_table::resize(): Pad the allocation size to an
integer multiple of the alignment.
Reviewed by: Vladislav Vaintroub
There was a race condition between log_checkpoint_low() and
deleting or renaming data files. The scenario is as follows:
1. The buffer pool does not contain dirty pages.
2. A FILE_DELETE or FILE_RENAME record is written.
3. The checkpoint LSN will be moved ahead of the write of the record.
4. The server is killed before the file is actually renamed or deleted.
We will prevent this race condition by ensuring that a log checkpoint
cannot occur between the durable write and the file system operation:
1. Durably write the FILE_DELETE or FILE_RENAME record.
2. Perform the file system operation.
3. Allow any log checkpoint to proceed.
mtr_t::commit_file(): Implement the DELETE or RENAME logic.
fil_delete_tablespace(): Delegate some of the logic to
mtr_t::commit_file().
fil_space_t::rename(): Delegate some logic to mtr_t::commit_file().
Remove the debug injection point fil_rename_tablespace_failure_2
because we do test RENAME failures without any debug injection.
fil_name_write_rename_low(), fil_name_write_rename(): Remove.
Tested by Matthias Leich
This commit contains workaround for a bug known as 'Red Hat issue 1870279'
(connection reset by peer issue in socat versions 1.7.3.3 to 1.7.4.0) which
further causes crashes during SST using mariabackup (when openssl is used).
Also fixed broken logic of automatic generation of the Diffie-Hellman parameters
for socat version less than 1.7.3 (which defaults to 512-bit values instead of
2048-bit ones).
recv_recover_page(): Correct a debug assertion to refer to recv_sys.lsn,
which may be ahead of log_sys.lsn during non-final recovery batches.
In commit 685d958e38 (MDEV-14425)
when some redundant LSN fields were removed,
log_sys.log.scanned_lsn had been replaced with a reference to
log_sys.lsn instead of the more appropriate recv_sys.lsn.
recv_scan_log(): Remove a redundant call to log_sys.set_recovered_lsn().
It suffices to adjust the log_sys.lsn after parsing (and before starting
to apply) records for the last batch.
Note: Normally, log_sys.lsn must be the latest log sequence number.
Before the final recovery batch, this may be safely violated, because
log_write_up_to() will be a no-op. That function will be invoked by the
buf_flush_page_cleaner thread to initiate writes of recovered pages.
Recent adventures in liburing and btrfs have shown up some kernel
version dependent bugs. Having a bug report of accurace kernel version
can start to correlate these errors sooner.
On Linux, /proc/version contains the kernel version.
FreeBSD has kern.version (per man 8 sysctl), so include that too.
Example output:
Max nice priority 0 0
Max realtime priority 0 0
Max realtime timeout unlimited unlimited us
Core pattern: |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h
Kernel version: Linux version 5.19.0-0.rc2.21.fc37.x86_64 (mockbuild@bkernel01.iad2.fedoraproject.org) (gcc (GCC) 12.1.1 20220507 (Red Hat 12.1.1-1), GNU ld version 2.38-14.fc37) #1 SMP PREEMPT_DYNAMIC Mon Jun 13 15:27:24 UTC 2022
Segmentation fault (core dumped)
it starts an EXPLAIN of a multi-table join and tries to KILL it.
no sync points.
depending on how fast the hareware is and optimizer development
it might kill EXPLAIN at some random point in time (generally unrelated
to the Bug#28598 it was supposed to test) or EXPLAIN might finish
before the KILL and the test will fail.
Work around MDEV-28718 for now, but also optimize the interation
of information_schema.SYSTEM_VARIABLES.
Add test case to show that tzinfo data into bootstrap is
desired functionality.
Bug report thanks to Dan Lenski of AWS.
We do not really care about the exact result; we only care that the
statistics will be accessed. The result could change depending on
when some statistics were updated in the background or when some
committed delete-marked rows were purged from other tables on
which persistent statistics are enabled.