Commit graph

200474 commits

Author SHA1 Message Date
Marko Mäkelä
852d42e993 MDEV-34483 Backup may copy unnecessarily much log
In mariadb-backup --backup there are multiple mechanisms for ensuring that
a sufficient amount of the InnoDB write-ahead log (ib_logfile0) is being
copied at the end of the backup. The backup needs to include the latest
committed transaction. While further transaction commits are blocked by
BACKUP STAGE BLOCK_COMMIT, ongoing transactions may modify the database
contents and write log records. We were unnecessarily copying such log
records, which would also cause further effort of rolling back incomplete
transactions after the backup is restored.

backup_wait_for_lsn(): Declare as static, and refactor some code
to separate functions backup_wait_for_lsn_low() and
backup_wait_timeout().

backup_wait_for_commit_lsn(): A new function to determine the current
LSN (within BACKUP STAGE BLOCK_COMMIT) and to wait for the log to be
copied until that. Invoked by BackupStages::stage_block_commit().

xtrabackup_backup_func(): Remove a condition that had already been
checked by a caller of backup_wait_timeout().

server_lsn_after_lock: Declare as a local variable in
BackupStages::stage_block_ddl().

log_copying_thread(), io_watching_thread(): Use metadata_last_lsn
instead of metadata_to_lsn as the stop condition.

BackupStages::stage_block_commit(): Ensure that the log tables
(in particular, mysql.general_log) will have been copied before
the BACKUP STAGE BLOCK_COMMIT is being followed by any further
SQL statements.

Reviewed by: Debarun Banerjee
Tested by: Matthias Leich
2024-09-09 16:47:35 +03:00
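The waiting logic described in the MDEV-34483 entry above (determine a target LSN, then wait until the log has been copied up to it, or give up on timeout) can be illustrated with a minimal sketch. This is not the mariadb-backup code; the class `LogCopier` and its methods are hypothetical names chosen for illustration only.

```python
import threading


class LogCopier:
    """Toy model of log-copy progress (hypothetical; not the real implementation)."""

    def __init__(self):
        self._cond = threading.Condition()
        self.copied_lsn = 0

    def advance(self, lsn):
        # Called by the copying thread after it has copied log up to `lsn`.
        with self._cond:
            self.copied_lsn = max(self.copied_lsn, lsn)
            self._cond.notify_all()

    def wait_for_lsn(self, target_lsn, timeout):
        # Block until the copied LSN reaches the target, or until the timeout.
        # Returns True if the target was reached, False on timeout.
        with self._cond:
            return self._cond.wait_for(
                lambda: self.copied_lsn >= target_lsn, timeout=timeout
            )
```

In this sketch the waiter only needs the LSN that was current when commits were blocked; log written later by ongoing transactions does not extend the wait.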
Yuchen Pei
d002b1f503 Merge branch '10.6' into 10.11 2024-09-09 11:34:19 +10:00
Sergei Petrunia
c630e23a18 MDEV-34894: Poor query plan, because range estimates are not reused for ref(const)
(Variant 4, with @@optimizer_adjust_secondary_key_costs, reuse in two
places, and conditions are replaced with equivalent simpler forms in two more)

In best_access_path(), ReuseRangeEstimateForRef-3, the check
for whether
 "all used key parts have key_part_i=const"
was incorrect: it could produce a "NO" answer for cases when we
had:
 key_part1= const // some key parts are usable
 key_part2= value_not_in_join_prefix  //present but unusable
 key_part3= non_const_value // unusable due to gap in key parts.

This caused the optimizer to fail to apply ReuseRangeEstimateForRef
heuristics. The consequence is poor query plan choice when the index
in question has very skewed data distribution.

The fix is enabled if its @@optimizer_adjust_secondary_key_costs flag
is set.
2024-09-08 16:26:13 +03:00
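The condition discussed in the MDEV-34894 entry above can be sketched as follows. This is a simplified model, not the optimizer code: `KeyPartEq`, `used_prefix()` and `all_used_parts_are_const()` are hypothetical names, and the model assumes that ref access can only use the leading run of usable key parts.

```python
from dataclasses import dataclass


@dataclass
class KeyPartEq:
    is_const: bool  # the equality is key_partN = const
    usable: bool    # the value is available given the current join prefix


def used_prefix(parts):
    """Return the key parts that ref access can actually use."""
    prefix = []
    for part in parts:
        if not part.usable:
            break  # a gap makes every later key part unusable as well
        prefix.append(part)
    return prefix


def all_used_parts_are_const(parts):
    """Only the used key parts matter; unusable ones must not force a 'NO'."""
    used = used_prefix(parts)
    return bool(used) and all(part.is_const for part in used)


# The example from the commit message:
parts = [
    KeyPartEq(is_const=True, usable=True),    # key_part1 = const
    KeyPartEq(is_const=False, usable=False),  # key_part2 = value_not_in_join_prefix
    KeyPartEq(is_const=False, usable=True),   # key_part3 = non_const_value (after a gap)
]
```

A check that looks at every key part with an equality (`all(p.is_const for p in parts)`) answers "NO" for this input, while the check restricted to the used prefix answers "YES".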
Monty
c41ab95a38 Remove rows and cost from optimizer trace for not usable key parts 2024-09-07 16:51:52 +03:00
Monty
886d740ad7 Optimized max_part_bit in sql_select.cc to use my_find_first_bit. 2024-09-07 16:51:52 +03:00
Marko Mäkelä
f9f92b480e Merge 10.6 into 10.11 2024-09-06 16:17:42 +03:00
Monty
f0b2e76577 Removed ctrl-l from the source 2024-09-06 15:30:18 +03:00
Marko Mäkelä
2da4839bb6 Merge 10.6 into 10.11 2024-09-06 14:45:22 +03:00
Yuchen Pei
60b93cdd30 Merge branch '10.5' into 10.6 2024-09-06 13:52:57 +10:00
Yuchen Pei
e886c2ba02 MDEV-34757 Check leaf_tables_saved in partition pruning in UPDATE and DELETE 2024-09-06 11:41:59 +10:00
Yuchen Pei
00cb344085 MDEV-33858 Assertion `(mem_root->flags & 4) == 0' fails on 2nd execution of PS with -DWITH_PROTECT_STATEMENT_MEMROOT=ON
Simply adding tests as the bug is fixed with a backport of MDEV-34447
2024-09-06 11:41:59 +10:00
Yuchen Pei
2c3e07df47 MDEV-34447: Memory leakage is detected on running the test main.ps against the server 11.1
The memory leak happened on the second execution of a prepared statement
that runs an UPDATE statement with a correlated subquery on the right-hand
side of the SET clause. In this case, invocation of the method
  table->stat_records()
could return zero, which results in taking the 'if' branch
that handles an impossible WHERE condition. The issue is that this
branch missed the saving of leaf tables, which has to be performed as the
first condition optimization activity. Later, the PS statement memory root
is marked as read-only when the first execution of the prepared
statement finishes. The next time the same statement is executed, it hits the
assertion on an attempt to allocate memory on the PS memory root marked as
read-only.
This memory allocation takes place by the sequence of the following
invocations:
 Prepared_statement::execute
  mysql_execute_command
   Sql_cmd_dml::execute
    Sql_cmd_update::execute_inner
     Sql_cmd_update::update_single_table
      st_select_lex::save_leaf_tables
       List<TABLE_LIST>::push_back

To fix the issue, add the flag SELECT_LEX::leaf_tables_saved to control
whether the method SELECT_LEX::save_leaf_tables() has to be called or
whether it has already been invoked and no further invocation is required.

A similar issue could occur when running a DELETE statement with
the LIMIT clause in PS/SP mode. The cause of the memory leak is the same as
in the UPDATE case, and it is fixed in the same way.
2024-09-06 11:41:58 +10:00
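The guard flag described in the MDEV-34447 entry above follows a common "do this once" pattern. The sketch below is a toy model in Python, not the server code: the class `SelectLex`, the counter `save_calls` and the function `optimize()` are hypothetical, and where exactly the real flag is set is an assumption made here for illustration.

```python
class SelectLex:
    """Toy stand-in for SELECT_LEX (hypothetical; for illustration only)."""

    def __init__(self, leaf_tables):
        self.leaf_tables = list(leaf_tables)
        self.leaf_tables_exec = []
        self.leaf_tables_saved = False
        self.save_calls = 0  # counts how often the list was allocated

    def save_leaf_tables(self):
        # In the server this allocates on the statement memory root, which is
        # read-only after the first execution of a prepared statement.
        self.save_calls += 1
        self.leaf_tables_exec.extend(self.leaf_tables)
        self.leaf_tables_saved = True


def optimize(select_lex):
    """Every code path, including early-exit branches, goes through the guard."""
    if not select_lex.leaf_tables_saved:
        select_lex.save_leaf_tables()
```

With the flag in place, a second execution of the same statement finds the leaf tables already saved and performs no further allocation.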
Thirunarayanan Balathandayuthapani
4972f9fc0f MDEV-33087 ALTER TABLE...ALGORITHM=COPY should build indexes more efficiently
- Remove the usage of alter_algorithm variable and disable
the persistent statistics in alter_copy_bulk test case.
2024-09-05 16:24:16 +05:30
Ian Gilfillan
2ed33f2fb6 MDEV-26114: Update Sys Schema README 2024-09-05 13:24:57 +10:00
Daniel Black
8024b8e4c1 MDEV-33091 pcre2 headers - handle columnstore
From e735cf2ed7cefb2af36f10f3cb47dfc060789df3, the PCRE_INCLUDES
changed to PCRE_INCLUDE_DIRS for consistency.

The columnstore module depends on the old name.

Create a mapping for the columnstore submodule.

10.6+ fix for submodule is:
* https://github.com/mariadb-corporation/mariadb-columnstore-engine/pull/3304
2024-09-05 12:14:06 +10:00
Daniel Black
dff354e7df MDEV-34825: my_cpu.h - non-glibc ism for POWER
Taking both the FreeBSD[1] and Alpine[2] patch concepts;

provide non-GLIBC definitions for HMT_*.

Provide FreeBSD ASM base for __ppc_get_timebase.
Or alternately use __builtin_ppc_get_timebase, which is described
on https://gcc.gnu.org/onlinedocs/gcc/Basic-PowerPC-Built-in-Functions-Available-on-all-Configurations.html
and does not depend on glibc/musl.

[1] 15d22e1c70/databases/mariadb106-server/files/patch-include_my__cpu.h
[2] https://gitlab.alpinelinux.org/alpine/aports/-/blob/master/main/mariadb/ppc-remove-glibc-dep.patch
2024-09-05 12:14:06 +10:00
Piotr Kubaj
e9b70e59a3 MDEV-34825 FreeBSD - upstream riscv64 compatibility patch
From 15d22e1c70/databases/mariadb106-server/files/patch-sql_mysqld.cc
2024-09-05 12:14:06 +10:00
Piotr Kubaj
7b2b03c4f2 MDEV-34825 FreeBSD fails to build under clang natively
Upstream the patch from: 15d22e1c70/databases/mariadb106-server/files/patch-mysys_crc32_crc32c.cc
2024-09-05 12:14:06 +10:00
Sergei Golubchik
566c22e814 pcre.cmake: always check the library with check_library_exists()
even if pkg-config has it. otherwise build dependencies
aren't detected.
2024-09-05 12:14:06 +10:00
Sergei Golubchik
b2ebe1cb7b MDEV-33091 pcre2 headers aren't found on Solaris
use pkg-config to find pcre2, if possible

rename PCRE_INCLUDES to use PKG_CHECK_MODULES naming, PCRE_INCLUDE_DIRS
2024-09-05 12:14:06 +10:00
Daniel Black
2e23c7342f MDEV-34567 unit.my_apc always failing on FreeBSD-14
Without the call to my_mutex_init, the mutex attributes
my_fast_mutexattr and my_errorcheck_mutexattr are uninitialized.

Linux tolerates this but FreeBSD doesn't (and segfaults).

We fix this for all platforms, since the unit test should be testing the
standard mutexes of the system.
2024-09-05 12:14:06 +10:00
Daniel Black
c991efd9c3 MDEV-34825 FreeBSD fails to build under clang natively
clang doesn't have /usr/local/lib in the path. As such
there are various dependency linkages that will fail.

For example pcre and libfmt.
2024-09-05 12:14:06 +10:00
Marko Mäkelä
fe5829a121 MDEV-34446 SIGSEGV on SET GLOBAL innodb_log_file_size with memory-mapped log file
log_t::resize_write(): Advance log_sys.resize_lsn and reset
the resize_log offset to START_OFFSET whenever the memory-mapped buffer
would wrap around.

Previously, in case the initial target offset would be beyond the
requested innodb_log_file_size, we only adjusted the offset but
not the LSN. An incorrect LSN would cause log_sys.buf_free to be out
of bounds when the log resizing completes.

The log_sys.lsn_lock will cover the entire duration of replicating
memory-mapped log for resizing. We just need a mutex that is compatible
with the caller holding log_sys.latch. While the choice of mtr_t::finisher
(for normal log writes) depends on mtr_t::spin_wait_delay,
replicating the log during resizing is a rare operation where we can
afford possible additional context switching overhead.
2024-09-04 14:24:30 +03:00
Daniel Black
d1dc70675c MDEV-34864 SHOW INDEX FROM - SEQ_IN_INDEX to ULong
MySQL-Connector-Net casts SEQ_IN_INDEX to uint and will
raise an exception if the type is a System.Int64.

As we don't support a huge number of columns in
an index, reducing to a uint is sufficient to represent
all values and maintain compatibility with MySQL-Connector-Net.

This matches the type (uint) returned by MySQL-8.3 and 8.0.

Reviewer: Alexander Barkov <bar@mariadb.com>
2024-09-04 17:17:32 +10:00
Marko Mäkelä
9f0b106631 MDEV-34845 buf_flush_buffer_pool(): Assertion `!os_aio_pending_reads()' failed
buf_flush_buffer_pool(): Wait for any pending asynchronous reads
to complete. This assertion failed in a run where buf_read_ahead_linear()
had been triggered in an SQL statement that was executed right
before shutdown.

Reviewed by: Debarun Banerjee
2024-09-03 18:22:10 +03:00
Marko Mäkelä
9878238f74 MDEV-34791: Redundant page lookups hurt performance
btr_cur_t::search_leaf(): When the index root page is also a leaf page,
we may need to upgrade our existing shared root page latch into an
exclusive latch. Even if we end up waiting, the root page won't be able
to go away while we hold an index()->lock. The index page may be split;
that is all.

btr_latch_prev(): Acquire the page latch while holding a buffer-fix
and an index tree latch. Merge the change buffer if needed. Use
buf_pool_t::page_fix() for this special case instead of complicating
buf_page_get_low() and buf_page_get_gen().

row_merge_read_clustered_index(): Remove some code that does not seem
to be useful. No difference was observed with regard to removing this
code when a CREATE INDEX or OPTIMIZE TABLE statement was run concurrently
with sysbench oltp_update_index --tables=1 --table_size=1000 --threads=16.

buf_pool_t::unzip(): Decompress a ROW_FORMAT=COMPRESSED page.

buf_pool_t::page_fix(): Handle also ROW_FORMAT=COMPRESSED pages
as well as change buffer merge. Optionally return an error.
Add a flag for suppressing a page latch wait and a special return
value -1 to indicate that the call would block.
This is the preferred way of buffer-fixing blocks.
The functions buf_page_get_gen() and buf_page_get_low() are only being
invoked with rw_latch=RW_NO_LATCH in operations on SPATIAL INDEX.

buf_page_t: Define some static functions for interpreting state().

buf_page_get_zip(), buf_read_page(),
buf_read_ahead_random(), buf_read_ahead_linear():
Remove the redundant parameter zip_size. We must look up the
tablespace and can invoke fil_space_t::zip_size() on it.

buf_page_get_low(): Require mtr!=nullptr.

buf_page_get_gen(): Implement some lock downgrading during recovery.

ibuf_page_low(): Use buf_pool_t::page_fix() in a debug check.
We do wait for a page read here, because otherwise a debug assertion in
buf_page_get_low() in the test innodb.ibuf_delete could occasionally fail.

PageConverter::operator(): Invoke buf_pool_t::page_fix() in order
to possibly evict a block. This allows us to remove some
special case code from buf_page_get_low().
2024-09-03 14:15:57 +03:00
Denis Protivensky
4e2c02a12c MDEV-33133: MDL conflict handling code should skip BF-aborted trxs
It's possible that MDL conflict handling code is called more
than once for a transaction when:
- it holds more than one conflicting MDL lock
- reschedule_waiters() is executed,
which results in repeated attempts to BF-abort an already aborted
transaction.
In such situations, it might be that BF-aborting logic sees
a partially rolled back transaction and erroneously decides
on future actions for such a transaction.

The specific situation tested and fixed is when a SR transaction
applied in the node gets BF-aborted by a started TOI operation.
It's then caught with the server transaction already rolled back,
but with no MDL locks yet released. This caused wrong state
detection for such a transaction during repeated MDL conflict
handling code execution.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-03 07:45:57 +02:00
Julius Goryavsky
d5a669b6b6 Merge branch '10.5' into '10.6' 2024-09-03 07:44:51 +02:00
Julius Goryavsky
b3cc952916 galera tests: updated .result for galera_gtid_2_cluster test 2024-09-03 07:21:43 +02:00
Sergei Petrunia
c8d040938a MDEV-34720: Poor plan choice for large JOIN with ORDER BY and small LIMIT
(Variant 2b: call greedy_search() twice, correct handling for limited
 search_depth)

Modify the join optimizer to specifically try to produce join orders that
can short-cut their execution for ORDER BY..LIMIT clause.

The optimization is controlled by @@optimizer_join_limit_pref_ratio.
Default value 0 means don't construct short-cutting join orders.
Other value means construct short-cutting join order, and prefer it only
if it promises speedup of more than #value times.

In Optimizer Trace, look for these names:
* join_limit_shortcut_is_applicable
* join_limit_shortcut_plan_search
* join_limit_shortcut_choice
2024-09-02 16:37:18 +03:00
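The rule stated in the MDEV-34720 entry above for @@optimizer_join_limit_pref_ratio can be written down as a small decision function. This is a sketch under the assumption that "speedup" means the ratio of the two plan costs; `choose_join_order()` and its parameters are hypothetical names, not optimizer functions.

```python
def choose_join_order(regular_cost, shortcut_cost, pref_ratio):
    """Pick between the regular and the short-cutting join order.

    pref_ratio models @@optimizer_join_limit_pref_ratio:
      0     -> short-cutting join orders are not considered at all
      N > 0 -> prefer the short-cutting order only if it promises
               a speedup of more than N times
    """
    if pref_ratio == 0:
        return "regular"
    # speedup = regular_cost / shortcut_cost; compare without dividing
    if shortcut_cost * pref_ratio < regular_cost:
        return "shortcut"
    return "regular"
```

For example, with a ratio of 5, a short-cutting plan that is 10 times cheaper is chosen, while one that is exactly 5 times cheaper is not, because the speedup must be strictly greater than the configured value.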
Sergei Petrunia
819765a47d Code cleanup in test_if_skip_sort_order()
Added comments
Moved a part into find_indexes_matching_order().
2024-09-02 16:37:18 +03:00
Jan Lindström
72243bc236 MDEV-31173 : Server crashes when setting wsrep_cluster_address after adding invalid value to wsrep_allowlist table
Problem was that wsrep_schema tables were not marked as
category information. Fix allows access to wsrep_schema
tables even when node is detached.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-02 04:28:57 +02:00
Julius Goryavsky
d058be62b8 Merge branch '10.6' into '10.11' 2024-09-02 03:49:03 +02:00
Jan Lindström
a50a5e0f3b MDEV-34647 : 'INSERT...SELECT' on MyISAM table suddenly replicated by Galera
Replication of MyISAM and Aria DML is experimental and best
effort only. An earlier change made INSERT SELECT on both
MyISAM and Aria replicate using TOI and STATEMENT
replication. Replication should happen only if the user
has set the needed wsrep_mode setting.

Note: This commit contains additional changes compared
to those already made for the 10.5 branch.

+ small refactoring after main fix.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-02 00:13:05 +02:00
Denis Protivensky
235f33e360 MDEV-33133: MDL conflict handling code should skip BF-aborted trxs
It's possible that MDL conflict handling code is called more
than once for a transaction when:
- it holds more than one conflicting MDL lock
- reschedule_waiters() is executed,
which results in repeated attempts to BF-abort an already aborted
transaction.
In such situations, it might be that BF-aborting logic sees
a partially rolled back transaction and erroneously decides
on future actions for such a transaction.

The specific situation tested and fixed is when a SR transaction
applied in the node gets BF-aborted by a started TOI operation.
It's then caught with the server transaction already rolled back,
but with no MDL locks yet released. This caused wrong state
detection for such a transaction during repeated MDL conflict
handling code execution.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-01 16:19:59 +02:00
Jan Lindström
b1f7522170 MDEV-34841 : Enable working Galera tests
* Fixes galera.galera_bf_kill_debug test case.
* Enable galera_ssl_upgrade, galera_ssl_reload, galera_pc_bootstrap
* Add MDEV to disabled tests that miss it

P.S. This commit contains additional changes compared
to the similar commit for 10.5 branch.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-01 07:00:14 +02:00
Denis Protivensky
1c48950e1f MDEV-30536: Fix Galera bulk insert optimization MTR test
After closing https://github.com/codership/galera-bugs/issues/947,
Galera now correctly certifies table-level keys, which made bulk
insert work again.

The corresponding MTR test is made deterministic and re-enabled.

Requires Galera 26.4.19

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-01 06:56:25 +02:00
Julius Goryavsky
bac0804d81 Merge branch '10.5' into '10.6' 2024-09-01 06:51:25 +02:00
Jan Lindström
7e748d075b MDEV-34841 : Enable working Galera tests
* Fixes galera.galera_bf_kill_debug test case.
* Enable galera_ssl_upgrade, galera_ssl_reload, galera_pc_bootstrap
* Add MDEV to disabled tests that miss it

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-01 03:51:10 +02:00
Jan Lindström
dd64f29d6b MDEV-33897 : Galera test failure on galera_3nodes.galera_gtid_consistency
Based on the logs, SST was started before the donor reached
Primary state. Add wait_conditions to make sure that
nodes reach Primary state before starting the next node.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-01 03:01:37 +02:00
Alexey Yurchenko
83196a7b23 Add a basic MTR test for DDL error voting to ensure that all DDLs
generate consistent error messages.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-01 02:58:34 +02:00
Alexey Yurchenko
731a5aba0b Use only MySQL code for TOI error vote
For TOI events specifically we have a situation where in case of the
same error different nodes may generate different messages. This may
be for two reasons:
 - different locale setting between the current client session and
   server default (we can reasonably require server locales to be
   identical on all nodes, but user can change message locale for the
   session)
 - non-deterministic course of STATEMENT execution e.g. for ALTER TABLE

On the other hand we may reasonably expect TOI event failures since
they are executed after replication, so we must ensure that voting is
consistent. For that purpose error codes should be sufficiently unique
and deterministic for TOI event failures as DDLs normally deal with
a single object, so we can merely use MySQL error codes to vote on.

Notice that this problem does not happen with regular transactional
writesets, since the originator node will always vote success and
replica nodes are assumed to have the same global locale setting.
As such, different error messages indicate different errors even if
the error code is the same (e.g. ER_DUP_KEY can happen on different
rows or tables).

Use only MySQL error code (without the error message) for error voting
in case of TOI event failure.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-01 02:58:27 +02:00
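The voting rule described in the TOI error vote entry above can be illustrated with a small sketch. It is not the wsrep implementation: `vote_key()` is a hypothetical helper, and the message strings are illustrative examples of how the same error might read under two session locales.

```python
def vote_key(error_code, error_message, is_toi):
    """Return the value a node would vote with for a failed event.

    Assumption for this sketch: nodes agree when their vote keys are equal.
    """
    if is_toi:
        # TOI failure: the message may differ by session locale or by a
        # non-deterministic execution path, so vote on the code alone.
        return (error_code,)
    # Regular writeset: different messages indicate different errors.
    return (error_code, error_message)


# Same DDL failure (error 1050, ER_TABLE_EXISTS_ERROR) reported in two locales:
node1 = vote_key(1050, "Table 't1' already exists", is_toi=True)
node2 = vote_key(1050, "Tabelle 't1' bereits vorhanden", is_toi=True)
```

Here `node1 == node2`, so the vote is consistent; with `is_toi=False` the two keys would differ because the message is part of the key.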
Alexey Yurchenko
7119149f83 If donor loop receives unknown signal from the SST script it is an
error condition (SST failure), so it should set error code before
exiting.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-01 02:54:05 +02:00
Alexey Yurchenko
69c6cb5dc4 Fix recovering state GTID in case log file contains non-text bytes -
use grep with -a option.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-01 02:52:52 +02:00
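The problem addressed by the entry above is that a log file containing non-text bytes is treated as binary, so a plain text search does not print the matching line; `grep -a` forces text treatment. The same idea can be sketched in Python by searching the raw bytes. The marker text `Recovered position:` and the function name are assumptions made for this illustration, not taken from the commit.

```python
import re

# Hypothetical marker; the real script may look for a different string.
MARKER = re.compile(rb"Recovered position:\s*(\S+)")


def recovered_position(log_bytes):
    """Return the last recovered position found in a log, or None.

    Works on raw bytes, so stray non-text bytes elsewhere in the file
    cannot prevent the match (the analogue of `grep -a`).
    """
    last = None
    for line in log_bytes.splitlines():
        match = MARKER.search(line)
        if match:
            last = match.group(1).decode("ascii", errors="replace")
    return last
```

A log such as `b"\x00\xff junk\nWSREP: Recovered position: abcd-1234:42\n"` yields `"abcd-1234:42"` even though the first line is not valid text.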
Teemu Ollakka
54a10a4293 MDEV-32363 Shut down Galera networking and logging on fatal signal
When handling fatal signal, shut down Galera networking
before printing out stack trace and writing core file.
This is to achieve fail-silent semantics on crashes which may
keep the process running for a long time, but not fully responding
e.g. due to core dumping or symbol resolving.

Also suppress all Galera/wsrep logging so that logging from
background threads does not garble crash information from the signal handler.

Notice that for fully fail-silent crash, Galera 26.4.19 is needed.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-01 02:48:19 +02:00
Julius Goryavsky
b65bbb2fae MDEV-34647: small refactoring after main fix 2024-08-30 21:50:33 +02:00
Igor Babaev
74d7168765 MDEV-25084 Assertion failure when moving equality from having to where
This bug was fixed by the patch for bug MDEV-26402.
Only a test case that failed before this patch was applied is added
in this commit.
2024-08-30 10:28:28 -07:00
Marko Mäkelä
984606d747 MDEV-34750 SET GLOBAL innodb_log_file_size is not crash safe
The recent commit 4ca355d863 (MDEV-33894)
caused a serious regression for online InnoDB ib_logfile0 resizing,
breaking crash-safety unless the memory-mapped log file interface is
being used. However, the log resizing was broken also before this.

To prevent such regressions in the future, we extend the test
innodb.log_file_size_online with a kill and restart of the server
and with some writes running concurrently with the log size change.
When run sufficiently many times, this test revealed all the bugs that
are being fixed by the code changes.

log_t::resize_start(): Do not allow the resized log to start before
the current log sequence number. In this way, there is no need to
copy anything to the first block of resize_buf. The previous logic
regarding that was incorrect in two ways. First, we would have to
copy from the last written buffer (buf or flush_buf). Second, we failed
to ensure that the mini-transaction end marker bytes would be 1
in the buffer. If the source ib_logfile0 had wrapped around an odd number
of times, the end marker would be 0. This was occasionally observed
when running the test innodb.log_file_size_online.

log_t::resize_write_buf(): To adjust for the resize_start() change,
do not write anything that would be before the resize_lsn.
Take the buffer (resize_buf or resize_flush_buf) as a parameter.
Starting with commit 4ca355d863
we no longer swap buffers when rewriting the last log block.

log_t::append(): Define as a static function; only some debug
assertions need to refer to the log_sys object.

innodb_log_file_size_update(): Wake up the buf_flush_page_cleaner()
if needed, and wait for it to complete a batch while waiting for
the log resizing to be completed. If the current LSN is behind the
resize target LSN, we will write redundant FILE_CHECKPOINT records to
ensure that the log resizing completes. If the buf_pool.flush_list is
empty or the buf_flush_page_cleaner() is stuck for some reason, our wait
will time out in 5 seconds, so that we can periodically check if the
execution of SET GLOBAL innodb_log_file_size was aborted. Previously,
we could get into a busy loop here while the buf_flush_page_cleaner()
would remain idle.
2024-08-29 14:53:08 +03:00
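The remark in the MDEV-34750 entry above about the mini-transaction end marker (1 normally, 0 after an odd number of wrap-arounds) can be modelled with a toy function. This is a sketch assuming the marker value simply alternates each time the circular log wraps; `sequence_bit()`, `first_lsn` and `capacity` are names chosen for illustration and are not confirmed by the commit message.

```python
def sequence_bit(lsn, first_lsn, capacity):
    """Expected end-marker value at `lsn` in a circular log.

    first_lsn: LSN at the start of the circular area (assumed name)
    capacity:  number of payload bytes before the log wraps (assumed name)
    """
    wraps = (lsn - first_lsn) // capacity
    # Even number of wrap-arounds -> 1, odd number -> 0.
    return 1 if wraps % 2 == 0 else 0
```

Under this model, copying a block from a source log that has wrapped an odd number of times into a freshly started resized log would carry over a marker of 0 where 1 is expected, which matches the failure described in the entry.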
Jan Lindström
9091afdc55 MDEV-31173 : Server crashes when setting wsrep_cluster_address after adding invalid value to wsrep_allowlist table
Problem was that wsrep_schema tables were not marked as
category information. Fix allows access to wsrep_schema
tables even when node is detached.

This is 10.4-10.9 version of fix.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-08-29 13:41:23 +02:00
Jan Lindström
b1d74b7e72 MDEV-33997 : Assertion `((WSREP_PROVIDER_EXISTS_ && this->variables.wsrep_on) && wsrep_emulate_bin_log) || mysql_bin_log.is_open()' failed in int THD::binlog_write_row(TABLE*, bool, const uchar*)
The problem was that we did not detect that the table was partitioned,
in which case we need to find out what the actual underlying storage
engine is.

We should not use RSU for !InnoDB tables.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-08-29 13:41:23 +02:00