Commit graph

200139 commits

Author SHA1 Message Date
Andrei
c37b2a9f04 MDEV-30460 rpl.rpl_start_alter_restart_slave sometimes fails in BB with result length mismatch
The test was used to fail because of lacking a synchronization point
designating an expected for processing "CA_1" event has been indeed
taken into this phase.
The event apparently may not have even arrived at slave.

Fixed with deploying the missed synchronization.
2024-06-17 19:06:34 +03:00
Marko Mäkelä
346a0c1402 Merge 10.6 into 10.11 2024-06-17 09:08:07 +03:00
Marko Mäkelä
e60acae655 Merge 10.5 into 10.6 2024-06-17 08:40:07 +03:00
Monty
956bcf8f49 Change mysqldump to use DO instead of 'SELECT' for storing sequences.
This avoids a lot of SETVAL() results when applying a mysqldump with
sequences.
2024-06-16 10:51:33 +03:00
Monty
fef32fd9ad MDEV-34406 Enhance mariadb_upgrade to print failing query in case of error
To make this possible, it was also necessary to enhance the mariadb
client with the option --print-query-on-error.
This option can also be very useful when running a batch of queries
through the mariadb client and one wants to find out where things goes
wrong.

TODO: It would be good to enhance mariadb_upgrade to not call the mariadb
client for executing queries but instead do this internally.  This
would have made this patch much easier!

Reviewed by: Sergei Golubchik <serg@mariadb.com>
2024-06-16 10:51:33 +03:00
Marko Mäkelä
4b4c371fe7 MDEV-34297 fixup: -Wconversion on 32-bit 2024-06-14 13:21:19 +03:00
Thirunarayanan Balathandayuthapani
3271588bb7 MDEV-34381 During innodb_undo_truncate=ON recovery, InnoDB may fail to shrink undo* files
- During recovery, InnoDB may fail to shrink the undo tablespaces
when there are no pages to recover while applying the redo log.
This issue exists only when innodb_undo_truncate is enabled.
trx_lists_init_at_db_start() could've applied the redo logs
for undo tablespace page0.
2024-06-14 12:46:02 +05:30
Marko Mäkelä
32202c30bc Merge 10.5 into 10.6 2024-06-13 19:58:11 +03:00
Marko Mäkelä
c849952b71 MDEV-33840: Fix GCC -Wreorder
This fixes up the merge commit 829cb1a49c
2024-06-13 19:57:40 +03:00
Marko Mäkelä
dd13243b0d MDEV-33161 fixup: CMAKE_CXX_FLAGS=-DEXTRA_DEBUG 2024-06-13 19:42:18 +03:00
Marko Mäkelä
5b89cab44f Merge 10.6 into 10.11 2024-06-13 08:16:49 +03:00
Brandon Nesterenko
d3a7e46bb4 MDEV-34365: UBSAN runtime error: call to function io_callback(tpool::aiocb*)
On an UBSAN clang-15 build, if running with UBSAN option
halt_on_error=1 (the issue doesn't show up without it),
MTR fails during mysqld --bootstrap with UBSAN error:

call to function io_callback(tpool::aiocb*) through pointer to incorrect function type 'void (*)(void *)'

This patch corrects the parameter type of io_callback
to match its expected type defined by callback_func,
i.e. (void*).

Reviewed By:
============
<TODO>
2024-06-12 08:39:41 +03:00
Marko Mäkelä
fc9005adc4 Merge 10.5 into 10.6 2024-06-12 07:51:28 +03:00
Thirunarayanan Balathandayuthapani
5b39ded713 MDEV-34156 InnoDB fails to apply the redo log for compressed tablespace
Problem:
=======
During recovery, InnoDB fails to apply the redo log for
compressed tablespace. The reason is that InnoDB assumes
that pages has been freed while applying the redo log for it.
During multiple scan of redo logs, InnoDB stores the freed
page information when it have sufficient buffer pool pages.
Once it ran out of memory, InnoDB doesn't store freed page
information. But InnoDB assigns the freed page ranges to tablespace
in recv_init_crash_recovery_spaces() even though
InnoDB doesn't have complete freed range information.
While applying the redo log, InnoDB wrongly assumes that page
has been freed and it could lead to corruption of tablespace.
This issue is caused by commit 941af1fa58 (MDEV-31803)
and commit 2f9e264781 (MDEV-29911).

Solution:
========
During recovery, set recovery size and freed page information
for all tablespace irrespective of memory.
2024-06-11 16:14:36 +05:30
Marko Mäkelä
b81d717387 Merge 10.6 into 10.11 2024-06-11 12:50:10 +03:00
Vladislav Vaintroub
f2eda61579 MDEV-33616 workaround libmariadb bug : mysql_errno = 0 on failed connection
The bug can happens on macOS, if server closes the socket without sending
error packet to client. Closing the socket on server side is legitimate,
and happen e.g when write timeout occurs, perhaps also other situations.

However mysqltest is not prepared to handle mysql_errno 0, and erroneously
thinks connection was successfully established.

The fix/workaround in mysqltest is to treat client failure with
mysql_errno 0 the same as CR_SERVER_LOST (generic client-side
communication error)

The real fix in client library would ensure that mysql_errno is set
on errors.
2024-06-11 09:15:32 +02:00
Yuchen Pei
d524cb5b3d
MDEV-34002 Initialise fields in spider_db_handler
Otherwise it may result in nonsensical values like 190 for a boolean.
2024-06-11 09:18:42 +10:00
Sergei Golubchik
40dd5b8676 fix the test for --view 2024-06-10 21:03:53 +02:00
Dave Gosselin
90d376e017 MDEV-34129 mariadb-install-db appears to hang on macOS
Immediately close down the signal handler loop when we decide to
break connections as it's the start of process termination
anyway, and there's no need to wait once we've invoked break_connections.
2024-06-10 15:00:10 -04:00
Brandon Nesterenko
fcd21d3e40 MDEV-34355: rpl.rpl_semi_sync_no_missed_ack_after_add_slave ‘server_3 should have sent…’
The problem is that the test could query the status variable
Rpl_semi_sync_slave_send_ack before the slave actually updated it.
This would result in an immediate --die assertion killing the rest
of the test. The bottom of this commit message has a small patch
that can be applied to reproduce the test failure.

This patch fixes the test failure by waiting for the variable to be
updated before querying its value.

diff --git a/sql/semisync_slave.cc b/sql/semisync_slave.cc
index 9ddd4c5c8d7..60538079fce 100644
--- a/sql/semisync_slave.cc
+++ b/sql/semisync_slave.cc
@@ -303,7 +303,10 @@ int Repl_semi_sync_slave::slave_reply(Master_info *mi)
     reply_res= DBUG_EVALUATE_IF("semislave_failed_net_flush", 1,
                                 net_flush(net));
     if (!reply_res)
+    {
+      sleep(1);
       rpl_semi_sync_slave_send_ack++;
+    }
   }
   DBUG_RETURN(reply_res);
 }
2024-06-10 12:27:20 -06:00
Alexander Barkov
3b80d23d02 mtr --skip-not-found did not skip suites
--skip-not-found switch tells mtr to skip not found tests instead of aborting.
But it failed to skip the test if the suite name was not found.

This problem also made the *last-N-failed builbot builders fail
to run `mtr --skip-not-found` if the last commit removed a file in
the mysql-test/include/ directory.

This commit fixes it, now the not found test is properly skipped,
no matter what component of the test name was not found:

$ ./mtr main.foo --skip-not-found foo.main
...
==============================================================================
TEST                                  WORKER RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------
foo.main                                 [ skipped ]  not found
main.foo                                 [ skipped ]  not found
--------------------------------------------------------------------------
2024-06-10 19:17:00 +02:00
Marko Mäkelä
27834ebc91 Merge 10.5 into 10.6 2024-06-10 15:22:15 +03:00
Marko Mäkelä
a2bd936c52 MDEV-33161 Function pointer signature mismatch in LF_HASH
In cmake -DWITH_UBSAN=ON builds with clang but not with GCC,
-fsanitize=undefined will flag several runtime errors on
function pointer mismatch related to the lock-free hash table LF_HASH.

Let us use matching function signatures and remove function pointer
casts in order to avoid potential bugs due to undefined behaviour.

These errors could be caught at compilation time by
-Wcast-function-type-strict, which is available starting with clang-16,
but not available in any version of GCC as of now. The old GCC flag
-Wcast-function-type is enabled as part of -Wextra, but it specifically
does not catch these errors.

Reviewed by: Vladislav Vaintroub
2024-06-10 12:35:33 +03:00
Alexander Barkov
246c0b3a35 MDEV-34227 On startup: UBSAN: runtime error: applying non-zero offset in JOIN::make_aggr_tables_info in sql/sql_select.cc
Avoid undefined behaviour (applying offset to nullptr).
The reported scenario is covered in mysql-test/connect-no-db.test
No new tests needed.
2024-06-10 12:50:52 +04:00
Alexander Barkov
21f56583bf MDEV-32376 SHOW CREATE DATABASE statement crashes the server when db name contains some unicode characters, ASAN stack-buffer-overflow
Adding the test for the length of lex->name into show_create_db().

Without this test writes beyond the end of db_name_buff were possible
upon a too long database name.
2024-06-10 09:31:14 +04:00
Brandon Nesterenko
bf0aa99aeb MDEV-34237: On Startup: UBSAN: runtime error: call to function MDL_lock::lf_hash_initializer lf_hash_insert through pointer to incorrect function type 'void (*)(st_lf_hash *, void *, const void *)'
A few different incorrect function type UBSAN issues have been
grouped into this patch.

The only real potentially undefined behavior is an error about
show_func_mutex_instances_lost, which when invoked in
sql_show.cc::show_status_array(), puts 5 arguments onto the stack;
however, the implementing function only actually has 3 parameters (so
only 3 would be popped). This was fixed by adding in the remaining
parameters to satisfy the type mysql_show_var_func.

The rest of the findings are pointer type mismatches that wouldn't
lead to actual undefined behavior. The lf_hash_initializer function
type definition is

typedef void (*lf_hash_initializer)(LF_HASH *hash, void *dst, const void *src);

but the MDL_lock and table cache's implementations of this function
do not have that signature. The MDL_lock has specific MDL object
parameters:

static void lf_hash_initializer(LF_HASH *hash __attribute__((unused)),
                                MDL_lock *lock, MDL_key *key_arg)

and the table cache has specific TDC parameters:

static void tdc_hash_initializer(LF_HASH *,
                                 TDC_element *element, LEX_STRING *key)

leading to UBSAN runtime errors when invoking these functions.

This patch fixes these type mis-matches by changing the
implementing functions to use void * and const void * for their
respective parameters, and later casting them to their expected
type in the function body.

Note too the functions tdc_hash_key and tc_purge_callback had
a similar problem to tdc_hash_initializer and was fixed
similarly.

Reviewed By:
============
Sergei Golubchik <serg@mariadb.com>
2024-06-08 19:59:59 -06:00
Julius Goryavsky
0d85c905c4 MDEV-34269: post-fix code simplification
The code is slightly simplified taking into account
the fact that partition_ht() always returns a normal
hton when there is no partitioning.
2024-06-07 18:26:08 +02:00
Jan Lindström
0172887980 MDEV-34269 : 10.11.8 cluster becomes inconsistent when using composite primary key and partitioning
This is regression from commit 3228c08fa8. Problem is that
when table storage engine is determined there should be
check is table partitioned and if it is then determine
partition implementing storage engine.

Reported bug is reproducible only with --log-bin so make
sure tests changed by 3228c08fa8 and new test are run
with --log-bin and binlog disabled.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-06-07 18:26:08 +02:00
Marko Mäkelä
e255837eaf MDEV-34266 safe_strcpy() includes an unnecessary conditional branch
The strncpy() wrapper that was introduced in
commit 567b681299
is checking whether the output was truncated even in cases
where the caller does not care about it.

Let us introduce a separate function safe_strcpy_truncated() that
indidates whether the output was truncated.
2024-06-07 19:24:36 +03:00
Thirunarayanan Balathandayuthapani
4b4dbb23ea MDEV-34169 Don't allow innodb_open_files to be lesser than
number of non-user tablespace.

fil_space_t::try_to_close(): Don't try to close
the tablespace which is acquired by the caller of
the function

Added the suppression message in open_files_limit test case
2024-06-07 20:50:39 +05:30
Oleksandr Byelkin
77c4c0f256 MDEV-34203 Sandbox mode \- is not compatible with --binary-mode
"Process" sandbox short command put by masqldump to avoid an error.
2024-06-07 14:07:54 +02:00
Marko Mäkelä
d9d0e8fd60 MDEV-34321: call to crc32c_3way through pointer to incorrect function type
In commit 9ec7819c58 the CRC-32 function
signatures had been unified somewhat, but not enough.

clang -fsanitize=undefined would flag a function pointer signature
mismatch between const char* and const void*, but not between
uint32_t and unsigned. We try to fix both inconsistencies anyway.

Reviewed by: Vladislav Vaintroub
2024-06-07 13:51:46 +03:00
Thirunarayanan Balathandayuthapani
b7a75fbb8a MDEV-34169 Don't allow innodb_open_files to be lesser than
number of non-user tablespace.

- InnoDB only closes the user tablespace when the number of open
files exceeds innodb_open_files limit. In that case, InnoDB should
make sure that innodb_open_files value should be greater
than number of undo tablespace, system and temporary tablespace files.
2024-06-07 15:37:11 +05:30
Marko Mäkelä
a687cf8661 Merge 10.5 into 10.6 2024-06-07 10:03:51 +03:00
Julius Goryavsky
238798d978 MDEV-32158: wsrep_sst_mariabackup use /tmp dir during SST rather then user defined tmpdir
wsrep_sst_mariabackup should use the tmpdir defined by
the user under the '[mysqld]' section of the configuration
file rather than the default '/tmp' directory.
2024-06-06 20:24:13 +02:00
Julius Goryavsky
654f6ecec4 galera: wsrep-lib submodule update 2024-06-06 19:37:31 +02:00
Julius Goryavsky
c2d9762011 mtr: сhange the default setting for the port group size parameter
Some galera tests starts 6 galera nodes. Each galera node requires
three ports: 6*3 = 18. Plus 6 ports are needed for 6 mariadbd servers.
Since the number of ports is rounded up to 10 everywhere in mtr, we
will take 30 as the default value for the port group size parameter.
2024-06-06 19:31:28 +02:00
Daniele Sciascia
c1dc03974b MDEV-33523 Spurious deadlock error when wsrep_on=OFF
Avoid starting transactions in wsrep-lib side when wsrep is
disabled. It is unnecessary, and causes spurious deadlock errors on
transaction clean up.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-06-06 19:19:34 +02:00
Jan Lindström
d328705a12 MDEV-34170 : table gtid_slave_pos entries never been deleted with wsrep_gtid_mode = 0
Problem was that updates to mysql.gtid_slave_pos table were
replicated even when they were newer used and because that
newer deleted. Avoid replication of mysql.gtid_slave_pos
table if wsrep_gtid_mode=OFF.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-06-06 19:19:34 +02:00
Thirunarayanan Balathandayuthapani
a02773f7c0 MDEV-34057 Inconsistent FTS state in concurrent scenarios
Problem:
=======
- This commit is a merge of mysql commit 129ee47ef994652081a11ee9040c0488e5275b14.
InnoDB FTS can be in inconsistent state when sync operation
terminates the server before committing the operation. This
could lead to incorrect synced doc id and incorrect query results.

Solution:
========
- During sync commit operation, InnoDB should pass
the sync transaction to update the max doc id
in the config table.

fts_read_synced_doc_id() : This function is used
to read only synced doc id from the config table.
2024-06-06 19:09:13 +05:30
Marko Mäkelä
699d38d951 MDEV-34296 extern thread_local is a CPU waste
In commit 99bd22605938c42d876194f2ec75b32e658f00f5 (MDEV-31558)
we wrongly thought that there would be minimal overhead for accessing
a thread-local variable mariadb_stats.

It turns out that in C++11, each access to an extern thread_local
variable requires conditionally invoking an initialization function.
In fact, the initializer expression of mariadb_stats is dynamic, and
those calls were actually unavoidable.

In C++20, one could declare constinit thread_local variables, but
the address of a thread_local variable (&mariadb_dummy_stats) is not
a compile-time constant. We did not want to declare mariadb_dummy_stats
without thread_local, because then the dummy accesses could lead to
cache line contention between threads.

mariadb_stats: Declare as __thread or __declspec(thread) so that
there will be no dynamic initialization, but zero-initialization.

mariadb_dummy_stats: Remove. It is a lesser evil to let
the environment perform zero-initialization and check if
!mariadb_stats.

Reviewed by: Sergei Petrunia
2024-06-06 14:38:42 +03:00
Marko Mäkelä
9fac857f26 MDEV-34283 A misplaced btr_cur_need_opposite_intention() check may fail to prevent hangs
btr_cur_t::search_leaf(): Invoke btr_cur_need_opposite_intention() after
positioning page_cur.rec so that the record will be in the intended page.
This is something that was broken in
commit f2096478d5 or
commit de4030e4d4 or related changes.

btr_cur_need_opposite_intention(): Add a debug assertion that would
catch the misuse.

The "next line of defence" that should have caught this bug in debug builds
are assertions that mtr_t::m_memo contains MTR_MEMO_X_LOCK for the
dict_index_t::lock. When btr_cur_need_opposite_intention() holds,
we should escalate to acquiring an exclusive index->lock in
btr_cur_t::pessimistic_search_leaf().

Reviewed by: Debarun Banerjee
2024-06-06 13:03:34 +03:00
Marko Mäkelä
bc3660925d MDEV-34307 On startup, [FATAL] InnoDB: Page ... still fixed or dirty
buf_pool_invalidate(): Properly wait for
os_aio_wait_until_no_pending_writes() to ensure so that there
are no pending buf_page_t::write_complete() or buf_page_write_complete()
operations. This will avoid a failure of buf_pool.assert_all_freed().

This bug should affect debug builds only. At this point, the
buf_pool.flush_list should be clear and all changes should have
been written out. The loop around buf_LRU_scan_and_free_block() should
have eventually completed and freed all pages as soon as
buf_page_t::write_complete() had a chance to release the page latches.

It is worth noting that buf_flush_wait() is working as intended.
As soon as buf_flush_page_cleaner() invokes
buf_pool.get_oldest_modification() it will observe that
buf_page_t::write_complete() had assigned oldest_modification_ to 1,
and remove such blocks from buf_pool.flush_list. Upon reaching
buf_pool.flush_list.count=0 the buf_flush_page_cleaner() will mark
itself idle and wake buf_flush_wait() by broadcasting
buf_pool.done_flush_list.

This regression was introduced in
commit a55b951e60 (MDEV-26827).

Reviewed by: Debarun Banerjee
2024-06-06 10:18:42 +03:00
Rucha Deodhar
0406b2a4ed MDEV-34143: Server crashes when executing JSON_EXTRACT after setting
non-default collation_connection

Analysis:
Due to different collation, the string has nothing to chop off.

Fix:
Got rid of chop(), only append " ," only when we have more elements to
add to the result.
2024-06-06 11:41:01 +05:30
Vladislav Vaintroub
ce9efb4e02 MDEV-34296 tpool - declare thread_local_waiter "static thread_local" 2024-06-05 16:55:11 +02:00
Nikita Malyavin
7d86751de5 mtr: run check-testcase client process under debugger 2024-06-05 16:50:51 +02:00
mariadb-DebarunBanerjee
b12c14e3b4 MDEV-34265 Possible hang during IO burst with innodb_flush_sync enabled
When checkpoint age goes beyond the sync flush threshold and
buf_flush_sync_lsn is set, page cleaner enters into "furious flush"
stage to aggressively flush dirty pages from flush list and pull
checkpoint LSN above safe margin. In this stage, page cleaner skips
doing LRU flush and eviction.

In 10.6, all other threads entirely rely on page cleaner to generate
free pages. If free pages get over while page cleaner is busy in
"furious flush" stage, a session thread could wait for free page in the
middle of a min-transaction(mtr) while holding latches on other pages.

It, in turn, can prevent page cleaner to flush such pages preventing
checkpoint LSN to move forward creating a deadlock situation. Even
otherwise, it could create a stall and hang like situation for large BP
with plenty of dirty pages to flush before the stage could finish.

Fix: During furious flush, check and evict LRU pages after each flush
iteration.
2024-06-05 18:11:29 +05:30
Vladislav Vaintroub
db9c2d225e fix typo 2024-06-05 14:28:50 +02:00
Vladislav Vaintroub
bfd3f45e8e Appveyor - better filtering for branches to match buildbot 2024-06-05 12:26:46 +02:00
Vladislav Vaintroub
b242b44f0a Appveyor build - skip irrelevant commits
Since we're only building on Windows, skip changes to debian directory
and to shell scripts.
2024-06-05 12:13:33 +02:00