In commit 244fdc435d (MDEV-29438)
we made sure that if the preceding record is the page infimum record,
no more than 8 bytes will be read from it. But, if the data payload of
the being-inserted record is less than 8 bytes (this can happen in
secondary indexes), we must not compare all 8 bytes.
This was caught by a failure of the test gcol.innodb_virtual_basic
under MemorySanitizer and some builds with AddressSanitizer.
This bug was found in MariaDB Server 10.6 thanks to the
OPT_PAGE_CHECKSUM record that was implemented
in commit 4179f93d28 for catching
this type of recovery failures.
page_cur_insert_rec_low(): If the previous record is the page infimum,
correctly limit the end of the record. We do not want to copy data from
the header of the page supremum. This omission caused the incorrect
recovery of DB_TRX_ID in an instant ALTER TABLE metadata record, because
part of the DB_TRX_ID was incorrectly copied from the n_owned of the
page supremum, which in recovery would be updated after the copying,
but in normal operation would already have been updated at the time the
common prefix was being determined.
log_phys_t::apply(): If a data page is found to be corrupted, do not
flag the log corrupted but instead return a new status APPLIED_CORRUPTED
so that the caller may discard all log for this page. We do not want
the recovery of unrelated pages to fail in recv_recover_page().
No test case is included, because the known test case would only work
in 10.6, and even after this fix, it would trigger another bug in
instant ALTER TABLE crash recovery.
Making changes to wsrep_mysqld.h causes large parts of server code to
be recompiled. The reason is that wsrep_mysqld.h is included by
sql_class.h, even tough very little of wsrep_mysqld.h is needed in
sql_class.h. This commit introduces a new header file, wsrep_on.h,
which is meant to be included from sql_class.h, and contains only
macros and variable declarations used to determine whether wsrep is
enabled.
Also, header wsrep.h should only contain definitions that are also
used outside of sql/. Therefore, move WSREP_TO_ISOLATION* and
WSREP_SYNC_WAIT macros to wsrep_mysqld.h.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
Thanks to references from Brad Smith, BSDs use getmntinfo as
a system call for mounted filesystems.
Most BSDs return statfs structures, (and we use OSX's statfs64),
but NetBSD uses a statvfs structure.
Simplify Linux getmntent_r to just use getmntent.
AIX uses getmntent.
An attempt at writing Solaris compatibility with
a small bit of HPUX compatibility was made based on man page
entries only. Fixes welcome.
statvfs structures now use f_bsize for consistency with statfs
Test case adjusted as PATH_MAX is OS defined (e.g. 1023 on AIX)
Fixes: 0ee5cf837e
also fixes:
MDEV-27818: Disk plugin does not show zpool mounted devices
This is because zpool mounted point don't begin with /.
Due to the proliferation of multiple filesystem types since this
was written, we restrict the entries listed in the disks plugin
to excude:
* read only mount points (no point monitoring, and
includes squash, snaps, sysfs, procfs, cgroups...)
* mount points that aren't directories (excludes /etc/hostname and
similar mounts in containers). (getmntent (Linux/AIX) only)
* exclude systems where there is no capacity listed (excludes various
virtual filesystem types).
Reviewer: Sergei Golubchik
Add ORDER BY to make the test deterministic.
Add FLUSH TABLES to avoid crash recovery warnings about the table
mysql.plugin. This tends to occur on Valgrind, where the server
shutdown could presumably time out, resulting in a forced kill.
dict_table_rename_in_cache(), dict_table_get_highest_foreign_id():
Reserve sufficient space for the fkid[] buffer, and ensure that the
fkid[] will be NUL-terminated.
The fkid[] must accommodate both the database name (which is already
encoded in my_charset_filename) and the constraint name
(which must be converted to my_charset_filename) so that we can check
if it is in the format databasename/tablename_ibfk_1 (all encoded in
my_charset_filename).
trx_undo_page_report_rename(): Use the correct maximum length of
a table name. Both the database name and the table name can be up to
NAME_CHAR_LEN (64 characters) times 5 bytes per character in the
my_charset_filename encoding. They are not encoded in UTF-8!
fil_op_write_log(): Reserve the correct amount of log buffer for
a rename operation. The file name will be appended by
mlog_catenate_string().
rename_file_ext(): Reserve a large enough buffer for the file names.
empty identifier specified as `` ends up with a NULL LEX_CSTRING::str in lexer.
This is not considered correct in upper layers, for example in Compare_identifiers::operator().
Empty column name is usually avoided by a check_column_name() call while parsing,
and period name matches the column name completely.
Hence, this fix uses the mentioned call for verification, too.
Since the 10.5 split of the privileges, the required GRANTs
for various mariabackup operations has changed.
In the addition of tests, a number of mappings where incorrect:
The option --lock-ddl-per-table didn't require connection admin.
The option --safe-slave-backup requires SLAVE MONITOR even without
the --no-lock option.
This bug affected some queries with an IN/ALL/ANY predicand or an EXISTS
predicate whose subquery contained a GROUP BY clause that could be
eliminated. If this clause used a IN/ALL/ANY predicand whose left operand
was a single-value subquery then execution of the query caused a crash of
the server after invokation of remove_redundant_subquery_clauses().
The crash was caused by an attempt to exclude the unit for the single-value
subquery from the query tree for the second time by the function
Item_subselect::eliminate_subselect_processor().
This bug had been masked by the bug MDEV-28617 until a fix for the latter
that properly excluded units was pushed into 10.3.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
buf_defer_drop_ahi(): Remove. Ever since
commit c7f8cfc9e7 (MDEV-27700)
it is safe to invoke btr_search_drop_page_hash_index(block, true)
to remove an orphan adaptive hash index.
Any attempt to upgrade page latches is prone to deadlocks. Recently,
we observed a few hangs that involved nothing more than a small table
consisting of one clustered index page, one secondary index page and
some undo pages.
Removed use std::vector's ba push_back(), pop_back() to make it more
obvious that memory in the vectors won't be reallocated.
Also, "borrowed" elements can be debugged a little better now,
they are put into the start of the m_cache vector.
- InnoDB fts table initially added to LRU table cache
while creating the table. Later, table was marked
as non-evicted when we add the table to fts optimizer
list. Before marking the table as non-evicted, master
thread can try to evict the fts table.
- Race condition between fsp_get_available_space_in_free_extents()
and fsp_try_extend_data_file() while accessing space.free_limit.
Before calling fsp_get_available_space_in_free_extents(), take
shared lock on space->latch.
Problem:
========
When replicating SET DEFAULT ROLE, the pre-update check (i.e. that
in set_var_default_role::check()) tries to validate the existence of
the given rules/user even when the targeted tables are ignored. When
previously issued CREATE USER/ROLE commands are ignored by the
replica because of the replication filtering rules, this results in
an error because the targeted data does not exist.
Solution:
========
Before checking that the given roles/user exist of a SET DEFAULT
ROLE command, first ensure that the mysql.user and
mysql.roles_mapping tables are not excluded by replication filters.
Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
Sergei Golubchik <serg@mariadb.com>
Reason:
=======
Race condition between btr_search_drop_hash_index() and
btr_search_lazy_free(). One thread does resizing of buffer pool
and clears the ahi on all pages in the buffer pool, frees the
index and table while removing the last reference. At the same time,
other thread access index->heap in btr_search_drop_hash_index().
Solution:
=========
Acquire the respective ahi latch before checking index->freed()
btr_search_drop_page_hash_index(): Added new parameter to indicate
that drop ahi entries only if the index is marked as freed
btr_search_check_marked_free_index(): Acquire all ahi latches and
return true if the index was freed
TIMESTAMP columns were compared as strings in ALL/ANY comparison,
which did not work well near DST time change.
Changing ALL/ANY comparison to use "Native" representation to compare
TIMESTAMP columns, like simple comparison does.
Ever since commit 09177eadc3
the test innodb.row_format_redundant cannot work when the
data file was created with innodb_checksum_algorithm=full_crc32.
Backport of 10.5 commit a9d0bb12e6
Even though commit b817afaa1c passed
the test mariabackup.compress_qpress, that test turned out to be
too small to reveal one more problem that had previously been prevented
by the existence of ctrl_mutex. I did not realize that there can be
multiple concurrent callers to compress_write(). One of them is the
log copying thread; further callers are data file copying threads
(default: --parallel=1).
By default, there is only one compression worker thread
(--compress-threads=1).
compress_write(): Fix a race condition between threads that would
use the same worker thread object. Make thd->data_avail contain the
thread identifier of the submitter, and add thd->avail_cond to
notify other compress_write() threads that are waiting for a slot.
- While creating a new InnoDB segment, allocates the extent
before allocating the inode or page allocation even though
the pages are present in fragment segment. This patch does
reserve the extent when InnoDB ran out of fragment pages
in the tablespace.
ha_spider::field_exchange() returns NULL and that results in a crash
or a assertion failure in spider_db_open_item_ident().
In the first place, there seems to be no need to call field_exchange()
for printing an identity (column name with alias). Thus, we simply
remove the call.
Analysis: JSON_VALUE() returns "null" string instead of NULL pointer.
Fix: When the type is JSON_VALUE_NULL (which is also a scalar) set
null_value to true and return 0 instead of returning string.