Commit graph

17943 commits

Author SHA1 Message Date
Sergei Golubchik
7842cab8c0 MDEV-34318 post-merge fix 2024-10-17 10:08:24 +02:00
Marko Mäkelä
63913ce5af Merge 10.6 into 10.11 2024-10-03 10:55:08 +03:00
Marko Mäkelä
c6e4ea682c Merge 10.5 into 10.6 2024-10-03 10:42:58 +03:00
Marko Mäkelä
6878c9d591 MDEV-35050 fixup: ./mtr --embedded 2024-10-03 10:40:58 +03:00
Marko Mäkelä
7e0afb1c73 Merge 10.5 into 10.6 2024-10-03 09:31:39 +03:00
Marko Mäkelä
cc70ca7eab MDEV-35059 ALTER TABLE...IMPORT TABLESPACE with FULLTEXT SEARCH may corrupt the adaptive hash index
build_fts_hidden_table(): Correct a mistake that had been made in
commit 903ae30069 (MDEV-30655).
2024-10-02 11:09:31 +03:00
Sergei Petrunia
1cda4726ca MDEV-34993, part2: backport optimizer_adjust_secondary_key_costs
...and make the fix for MDEV-34993 switchable. It is enabled by default
and controlled with @optimizer_adjust_secondary_key_costs=fix_card_multiplier
2024-10-02 10:52:09 +03:00
Sergei Petrunia
8166a5d33d MDEV-34993: Incorrect cardinality estimation causes poor query plan
When calculate_cond_selectivity_for_table() takes into account multi-
column selectivities from range access, it tries to take-into account
that selectivity for some columns may have been already taken into account.

For example, for range access on IDX1 using {kp1, kp2}, the selectivity
of restrictions on "kp2" might have already been taken into account
to some extent.
So, the code tries to "discount" that using rec_per_key[] estimates.

This seems to be wrong and unreliable: the "discounting" may produce a
rselectivity_multiplier number that hints that the overall selectivity
of range access on IDX1 was greater than 1.

Do a conservative fix: if we arrive at conclusion that selectivity of
range access on condition in IDX1 >1.0, clip it down to 1.
2024-10-02 10:52:09 +03:00
Sergei Golubchik
9021f40b8e MDEV-35050 Found wrong usage of mutex upon setting plugin session variables 2024-10-01 18:29:11 +02:00
Marko Mäkelä
464055fe65 MDEV-34078 Memory leak in InnoDB purge with 32-column PRIMARY KEY
row_purge_reset_trx_id(): Reserve large enough offsets for accomodating
the maximum width PRIMARY KEY followed by DB_TRX_ID,DB_ROLL_PTR.

Reviewed by: Thirunarayanan Balathandayuthapani
2024-10-01 18:35:39 +03:00
Thirunarayanan Balathandayuthapani
cc810e64d4 MDEV-34392 Inplace algorithm violates the foreign key constraint
Don't allow the referencing key column from NULL TO NOT NULL
when

 1) Foreign key constraint type is ON UPDATE SET NULL
 2) Foreign key constraint type is ON DELETE SET NULL
 3) Foreign key constraint type is UPDATE CASCADE and referenced
 column declared as NULL

Don't allow the referenced key column from NOT NULL to NULL
when foreign key constraint type is UPDATE CASCADE
and referencing key columns doesn't allow NULL values

get_foreign_key_info(): InnoDB sends the information about
nullability of the foreign key fields and referenced key fields.

fk_check_column_changes(): Enforce the above rules for COPY
algorithm

innobase_check_foreign_drop_col(): Checks whether the dropped
column exists in existing foreign key relation

innobase_check_foreign_low() : Enforce the above rules for
INPLACE algorithm

dict_foreign_t::check_fk_constraint_valid(): This is used
by CREATE TABLE statement to check nullability for foreign
key relation.
2024-10-01 09:41:56 +05:30
Marko Mäkelä
d28ac3f82d MDEV-34207: ALTER TABLE...STATS_PERSISTENT=0 fails to drop statistics
commit_try_norebuild(): If the STATS_PERSISTENT attribute of the table
is being changed to disabled, drop the persistent statistics of the table.
2024-09-30 15:27:38 +03:00
Julius Goryavsky
95d285fb75 MDEV-30307 addendum: support for compilation in release mode 2024-09-30 00:33:23 +02:00
sjaakola
cf0c3ec274 MDEV-30307 KILL command inside a transaction causes problem for galera replication
Added new test scenario in galera.galera_bf_kill
test to make the issue surface. The tetst scenario has
a multi statement transaction containing a KILL command.
When the KILL is submitted, another transaction is
replicated, which causes BF abort for the KILL command
processing. Handling BF abort rollback while executing
KILL command causes node hanging, in this scenario.

sql_kill() and sql_kill_user() functions have now fix,
to perform implicit commit before starting the KILL command
execution. BEcause of the implicit commit, the KILL execution
will not happen inside transaction context anymore.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-27 19:26:26 +02:00
Sergei Petrunia
d0a6a7886b MDEV-25822 JSON_TABLE: default values should allow non-string literals
(Polished initial patch by Alexey Botchkov)
Make the code handle DEFAULT values of any datatype

- Make Json_table_column::On_response::m_default be Item*, not LEX_STRING.
- Change the parser to use string literal non-terminals for producing
  the DEFAULT value
-- Also, stop updating json_table->m_text_literal_cs for the DEFAULT
   value literals as it is not used.
2024-09-27 13:52:17 +03:00
Marko Mäkelä
6acada713a MDEV-34062: Implement innodb_log_file_mmap on 64-bit systems
When using the default innodb_log_buffer_size=2m, mariadb-backup --backup
would spend a lot of time re-reading and re-parsing the log. For reads,
it would be beneficial to memory-map the entire ib_logfile0 to the
address space (typically 48 bits or 256 TiB) and read it from there,
both during --backup and --prepare.

We will introduce the Boolean read-only parameter innodb_log_file_mmap
that will be OFF by default on most platforms, to avoid aggressive
read-ahead of the entire ib_logfile0 in when only a tiny portion would be
accessed. On Linux and FreeBSD the default is innodb_log_file_mmap=ON,
because those platforms define a specific mmap(2) option for enabling
such read-ahead and therefore it can be assumed that the default would
be on-demand paging. This parameter will only have impact on the initial
InnoDB startup and recovery. Any writes to the log will use regular I/O,
except when the ib_logfile0 is stored in a specially configured file system
that is backed by persistent memory (Linux "mount -o dax").

We also experimented with allowing writes of the ib_logfile0 via a
memory mapping and decided against it. A fundamental problem would be
unnecessary read-before-write in case of a major page fault, that is,
when a new, not yet cached, virtual memory page in the circular
ib_logfile0 is being written to. There appears to be no way to tell
the operating system that we do not care about the previous contents of
the page, or that the page fault handler should just zero it out.

Many references to HAVE_PMEM have been replaced with references to
HAVE_INNODB_MMAP.

The predicate log_sys.is_pmem() has been replaced with
log_sys.is_mmap() && !log_sys.is_opened().

Memory-mapped regular files differ from MAP_SYNC (PMEM) mappings in the
way that an open file handle to ib_logfile0 will be retained. In both
code paths, log_sys.is_mmap() will hold. Holding a file handle open will
allow log_t::clear_mmap() to disable the interface with fewer operations.

It should be noted that ever since
commit 685d958e38 (MDEV-14425)
most 64-bit Linux platforms on our CI platforms
(s390x a.k.a. IBM System Z being a notable exception) read and write
/dev/shm/*/ib_logfile0 via a memory mapping, pretending that it is
persistent memory (mount -o dax). So, the memory mapping based log
parsing that this change is enabling by default on Linux and FreeBSD
has already been extensively tested on Linux.

::log_mmap(): If a log cannot be opened as PMEM and the desired access
is read-only, try to open a read-only memory mapping.

xtrabackup_copy_mmap_snippet(), xtrabackup_copy_mmap_logfile():
Copy the InnoDB log in mariadb-backup --backup from a memory
mapped file.
2024-09-26 18:47:12 +03:00
Jan Lindström
024e95128b MDEV-32996 : galera.galera_var_ignore_apply_errors -> [ERROR] WSREP: Inconsistency detected
Add wait_until_ready waits after wsrep_on is set on again to
make sure that node is ready for next step before continuing.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-26 00:04:56 +02:00
Jan Lindström
0ce5603b86 MDEV-33035 : Galera test case MDEV-16509 unstable
Stabilize test by reseting DEBUG_SYNC and add wait_condition
for expected table contents.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-26 00:04:56 +02:00
Sergei Golubchik
dd1cad7e5f galera_3nodes.MDEV-29171 fails
set transferfmt in .cnf file like other galera tests do.
otherwise it defaults to socat when mtr detected that only nc is available
2024-09-24 14:30:24 +02:00
Denis Protivensky
231900e5bb MDEV-34836: TOI on parent table must BF abort SR in progress on a child
Applied SR transaction on the child table was not BF aborted by TOI running
on the parent table for several reasons:

Although SR correctly collected FK-referenced keys to parent, TOI in Galera
disregards common certification index and simply sets itself to depend on
the latest certified write set seqno.

Since this write set was the fragment of SR transaction, TOI was allowed to
run in parallel with SR presuming it would BF abort the latter.

At the same time, DML transactions in the server don't grab MDL locks on
FK-referenced tables, thus parent table wasn't protected by an MDL lock from
SR and it couldn't provoke MDL lock conflict for TOI to BF abort SR transaction.

In InnoDB, DDL transactions grab shared MDL locks on child tables, which is not
enough to trigger MDL conflict in Galera.

InnoDB-level Wsrep patch didn't contain correct conflict resolution logic due to
the fact that it was believed MDL locking should always produce conflicts correctly.

The fix brings conflict resolution rules similar to MDL-level checks to InnoDB,
thus accounting for the problematic case.

Apart from that, wsrep_thd_is_SR() is patched to return true only for executing
SR transactions. It should be safe as any other SR state is either the same as
for any single write set (thus making the two logically equivalent), or it reflects
an SR transaction as being aborting or prepared, which is handled separately in
BF-aborting logic, and for regular execution path it should not matter at all.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-24 11:14:01 +02:00
Marko Mäkelä
971cf59579 Merge 10.6 into 10.11 2024-09-24 08:49:20 +03:00
mariadb-DebarunBanerjee
35d477dd1d MDEV-34453 Trying to read 16384 bytes at 70368744161280 outside the bounds of the file: ./ibdata1
The issue is caused by a race between buf_page_create_low getting the
page from buffer pool hash and buf_LRU_free_page evicting it from LRU.

The issue is introduced in 10.6 by MDEV-27058
commit aaef2e1d8c
MDEV-27058: Reduce the size of buf_block_t and buf_page_t

The solution is buffer fix the page before releasing buffer pool mutex
in buf_page_create_low when x_lock_try fails to acquire the page latch.
2024-09-20 20:26:43 +05:30
Julius Goryavsky
cb83ae210c galera mtr suite: fixes for unstable tests 2024-09-19 09:02:46 +02:00
Lena Startseva
0a5e4a0191 MDEV-31005: Make working cursor-protocol
Updated tests: cases with bugs or which cannot be run
with the cursor-protocol were excluded with
"--disable_cursor_protocol"/"--enable_cursor_protocol"

Fix for v.10.5
2024-09-18 18:39:26 +07:00
Brandon Nesterenko
68938d2b42 MDEV-33500 (part 2): rpl.rpl_parallel_sbm can still fail
The failing test case validates Seconds_Behind_Master for a delayed
slave, while STOP SLAVE is executed during a delay. The test fixes
initially added to the test (commit b04c857596) added a table lock
to ensure a transaction could not finish before validating the
Seconds_Behind_Master field after SLAVE START, but did not address a
possibility that the transaction could finish before running the
STOP SLAVE command, which invalidates the validations for the rest
of the test case. Specifically, this would result in 1) a timeout in
“Waiting for table metadata lock” on the replica, which expects the
transaction to retry after slave restart and hit a lock conflict on
the locked tables (added in b04c857596), and 2) that
Seconds_Behind_Master should have increased, but did not.

The failure can be reproduced by synchronizing the slave to the master
before the MDEV-32265 echo statement (i.e. before the SLAVE STOP).

This patch fixes the test by adding a mechanism to use DEBUG_SYNC to
synchronize a MASTER_DELAY, rather than continually increase the
duration of the delay each time the test fails on buildbot. This is
to ensure that on slow machines, a delay does not pass before the
test gets a chance to validate results. Additionally, it decreases
overall test time because the test can continue immediately after
validation, thereby bypassing the remainder of a full delay for each
transaction.
2024-09-17 06:29:20 -06:00
Julius Goryavsky
f176248d4b Merge branch '10.6' into '10.11' 2024-09-17 06:23:10 +02:00
Julius Goryavsky
80fff4c6b1 Merge branch '10.5' into '10.6' 2024-09-16 16:39:59 +02:00
Marko Mäkelä
b187414764 Merge 10.6 into 10.11 2024-09-16 10:58:40 +03:00
Julius Goryavsky
7ee0e60bbb galera mtr tests: minor fixes to make tests more reliable 2024-09-15 05:05:03 +02:00
Dave Gosselin
95885261f0 MDEV-27037 mysqlbinlog emits a warning when reaching EOF before stop-datetime
Emit a warning in the event that we finished processing input files
before reaching the boundary indicated by --stop-datetime.
2024-09-12 08:43:29 -04:00
Dave Gosselin
242b67f1de MDEV-27037 mysqlbinlog emits a warning when reaching EOF before stop-condition
Emit a warning in the event that we finished processing input files
before reaching the boundary indicated by --stop-position.
2024-09-12 08:43:29 -04:00
Marko Mäkelä
a74bea7ba9 MDEV-34879 InnoDB fails to merge the change buffer to ROW_FORMAT=COMPRESSED tables
buf_page_t::read_complete(): Fix an incorrect condition that had been
added in commit aaef2e1d8c (MDEV-27058).
Also for compressed-only pages we must remember that buffered changes
may exist.

buf_read_page(): Correct the function comment; this is for a synchronous
and not asynchronous read. Pass the parameter unzip=true to
buf_read_page_low(), because each of our callers will be interested in
the uncompressed page frame. This will cause the test
encryption.innodb-compressed-blob to emit more errors when the
correct keys for decrypting the clustered index root page are unavailable.

Reviewed by: Debarun Banerjee
2024-09-12 10:52:55 +03:00
Yuchen Pei
a8c5717223
Merge branch '10.6' into 10.11 2024-09-12 10:44:13 +10:00
Monty
3ae4ecbfc5 MDEV-34867 engine S3 cause 500 error for huawei buckets
Add support for removing the Content-Type header to the S3 engine. This
is required for compatibility with some S3 providers.

This also adds a provider option to the S3 engine which will turn on
relevant compatibility options for specific providers.

This was required for getting MariaDB S3 engine to work with "Huawei
Cloud S3".
To get Huawei S3 storage to work on has set one of the following
S3 options:
s3_provider=Huawei
s3_ssl_no_verify=1

Author: Andrew Hutchings <andrew@mariadb.org>
2024-09-11 16:15:37 +03:00
Yuchen Pei
b168859d1e
Merge branch '10.6' into 10.11 2024-09-11 16:10:53 +10:00
Daniel Black
2496779d69 MDEV-34617 galera.galera_ist_mariabackup_verify_ca fails on FreeBSD
Was failing because innodb-log-file-buffering is a Linux/Windows only
variable.

This was introduced in MDEV-33787 to enforce O_DIRECT on Linux.
2024-09-10 16:08:39 +02:00
Marko Mäkelä
852d42e993 MDEV-34483 Backup may copy unnecessarily much log
In mariadb-backup --backup there are multiple mechanisms for ensuring that
a sufficient amount of the InnoDB write-ahead log (ib_logfile0) is being
copied at the end of the backup. The backup needs to include the latest
committed transaction. While further transaction commits are blocked by
BACKUP STAGE BLOCK_COMMIT, ongoing transactions may modify the database
contents and write log records. We were unnecessarily copying such log,
which would also cause further effort of rolling back incomplete
transactions after the backup is restored.

backup_wait_for_lsn(): Declare as static, and refactor some code
to separate functions backup_wait_for_lsn_low() and
backup_wait_timeout().

backup_wait_for_commit_lsn(): A new function to determine the current
LSN (within BACKUP STAGE BLOCK_COMMIT) and to wait for the log to be
copied until that. Invoked by BackupStages::stage_block_commit().

xtrabackup_backup_func(): Remove a condition that had already been
checked by a caller of backup_wait_timeout().

server_lsn_after_lock: Declare as a local variable in
BackupStages::stage_block_ddl().

log_copying_thread(), io_watching_thread(): Use metadata_last_lsn
instead of metadata_to_lsn as the stop condition.

BackupStages::stage_block_commit(): Ensure that the log tables
(in particular, mysql.general_log) will have been copied before
the BACKUP STAGE BLOCK_COMMIT is being followed by any further
SQL statements.

Reviewed by: Debarun Banerjee
Tested by: Matthias Leich
2024-09-09 16:47:35 +03:00
Marko Mäkelä
b7b2d2bde4 Merge 10.5 into 10.6 2024-09-09 11:30:30 +03:00
Yuchen Pei
d002b1f503
Merge branch '10.6' into 10.11 2024-09-09 11:34:19 +10:00
Sergei Petrunia
c630e23a18 MDEV-34894: Poor query plan, because range estimates are not reused for ref(const)
(Variant 4, with @@optimizer_adjust_secondary_key_costs, reuse in two
places, and conditions are replaced with equivalent simpler forms in two more)

In best_access_path(), ReuseRangeEstimateForRef-3,  the check
for whether
 "all used key_part_i used key_part_i=const"
was incorrect: it may produced a "NO" answer for cases when we
had:
 key_part1= const // some key parts are usable
 key_part2= value_not_in_join_prefix  //present but unusable
 key_part3= non_const_value // unusable due to gap in key parts.

This caused the optimizer to fail to apply ReuseRangeEstimateForRef
heuristics. The consequence is poor query plan choice when the index
in question has very skewed data distribution.

The fix is enabled if its @@optimizer_adjust_secondary_key_costs flag
is set.
2024-09-08 16:26:13 +03:00
Marko Mäkelä
f9f92b480e Merge 10.6 into 10.11 2024-09-06 16:17:42 +03:00
Marko Mäkelä
2da4839bb6 Merge 10.6 into 10.11 2024-09-06 14:45:22 +03:00
Marko Mäkelä
024a18dbcb MDEV-34823 Invalid arguments in ib_push_warning()
In the bug report MDEV-32817 it occurred that the function
row_mysql_get_table_status() is outputting a fil_space_t*
as if it were a numeric tablespace identifier.

ib_push_warning(): Remove. Let us invoke push_warning_printf() directly.

innodb_decryption_failed(): Report a decryption failure and set the
dict_table_t::file_unreadable flag. This code was being duplicated in
very many places. We return the constant value DB_DECRYPTION_FAILED
in order to avoid code duplication in the callers and to allow tail calls.

innodb_fk_error(): Report a FOREIGN KEY error.

dict_foreign_def_get(), dict_foreign_def_get_fields(): Remove.
This code was being used in dict_create_add_foreign_to_dictionary()
in an apparently uncovered code path. That ib_push_warning() call
would pass the integer i+1 instead of a pointer to NUL terminated
string ("%s"), and therefore the call should have resulted in a crash.

dict_print_info_on_foreign_key_in_create_format(),
innobase_quote_identifier(): Add const qualifiers.

row_mysql_get_table_error(): Replaces row_mysql_get_table_status().
Display no message on DB_CORRUPTION; it should be properly reported at
the SQL layer anyway.
2024-09-06 14:29:09 +03:00
Yuchen Pei
60b93cdd30
Merge branch '10.5' into 10.6 2024-09-06 13:52:57 +10:00
Thirunarayanan Balathandayuthapani
4972f9fc0f MDEV-33087 ALTER TABLE...ALGORITHM=COPY should build indexes more efficiently
- Remove the usage of alter_algorithm variable and disable
the persistent statistics in alter_copy_bulk test case.
2024-09-05 16:24:16 +05:30
Daniel Black
d1dc70675c MDEV-34864 SHOW INDEX FROM - SEQ_IN_INDEX to ULong
MySQL-Connector-Net casts SEQ_IN_INDEX to uint and will
raise an exception if the type is a System.Int64.

As we don't support a huge number of multi-columns in
an index reducing to a uint is sufficient to represent
all values and maintain compatibility with MySQL-Connector-Net.

This matches the type (uint) returned by MySQL-8.3 and 8.0.

Reviewer: Alexander Barkov <bar@mariadb.com>
2024-09-04 17:17:32 +10:00
Julius Goryavsky
d5a669b6b6 Merge branch '10.5' into '10.6' 2024-09-03 07:44:51 +02:00
Julius Goryavsky
b3cc952916 galera tests: updated .result for galera_gtid_2_cluster test 2024-09-03 07:21:43 +02:00
Sergei Petrunia
c8d040938a MDEV-34720: Poor plan choice for large JOIN with ORDER BY and small LIMIT
(Variant 2b: call greedy_search() twice, correct handling for limited
 search_depth)

Modify the join optimizer to specifically try to produce join orders that
can short-cut their execution for ORDER BY..LIMIT clause.

The optimization is controlled by @@optimizer_join_limit_pref_ratio.
Default value 0 means don't construct short-cutting join orders.
Other value means construct short-cutting join order, and prefer it only
if it promises speedup of more than #value times.

In Optimizer Trace, look for these names:
* join_limit_shortcut_is_applicable
* join_limit_shortcut_plan_search
* join_limit_shortcut_choice
2024-09-02 16:37:18 +03:00
Jan Lindström
72243bc236 MDEV-31173 : Server crashes when setting wsrep_cluster_address after adding invalid value to wsrep_allowlist table
Problem was that wsrep_schema tables were not marked as
category information. Fix allows access to wsrep_schema
tables even when node is detached.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-02 04:28:57 +02:00