Commit graph

196238 commits

Author SHA1 Message Date
mkaruza
136358036d MDEV-18590: galera.versioning_trx_id: Test failure: mysqltest: Result content mismatch
Replicated events have time associated with them from originating
node which will be used for commit timestamp. Associated time can
be set in past before event is even applied.

For WSREP replication we don't need to use time information from
event.

Addressed review comments:
	  Jan Lindström <jan.lindstrom@galeracluster.com>

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-04-27 18:40:58 +02:00
Jan Lindström
1532f12058 MDEV-33898 : Galera test failure on galera.MW-369
Tests using MW-369.inc sometimes hanged after
signaling two debug sync points inside a Galera
library. Replaced Galera library sync point
with server code sync point when possible and
added more wait_conditions to make sure we are
in correct state.

Tests effected: MW-369, MW-402, MDEV-27276, and
mysql-wsrep#332.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-04-26 20:21:44 +02:00
Julius Goryavsky
288ea9e146 galera SST scripts: parsing CN in certificates
This commit contains a fix for the code that extracts and parses
the CN (common name, domain name) record from certificates using
the openssl utility. This code is also made common to the rsync
and mariabackup scripts. There is also some systematization of
the use of 'printf' and 'echo' builtins/utilities.
2024-04-26 20:21:44 +02:00
Oleksandr Byelkin
ee59ca7ff1 Merge branch 'merge-zlib' (1.3.1) into 10.4 2024-04-26 13:50:03 +02:00
Hugo Wen
3d41747625 MDEV-33574 Improve mysqlbinlog error message
Previously, when running mysqlbinlog without providing a binlog file, it
would print the entire help text, which was very verbose and made it
difficult to identify the actual issue.

Now change the behavior to print a more concise error message instead:

    "ERROR: Please provide the log file(s). Run with '--help' for usage instructions."

This makes the error output more user-friendly and easier to understand,
especially when running the tool in scripts or automated processes.

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer
Amazon Web Services, Inc.
2024-04-26 12:27:28 +01:00
Oleksandr Byelkin
5aff13b65c zlib 1.3.1 2024-04-26 13:18:51 +02:00
Oleksandr Byelkin
45846bacb3 v5.7.0-stable 2024-04-26 13:02:47 +02:00
Jan Lindström
b3e531a3cc MDEV-33896 : Galera test failure on galera_3nodes.MDEV-29171
Based on logs we might start SST before donor has reached
Primary state. Because this test shutdowns all nodes we
need to make sure when we start nodes that previous nodes
have reached Primary state and joined the cluster.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-04-25 16:32:06 +02:00
Marko Mäkelä
10d251e05a MDEV-26450 fixup: Remove a bogus assertion
mtr_t::commit_shrink(): Do not assert that some previously clean pages
will be flagged as modified by this mini-transaction. It could be the
case that there had been no recent write-back of any of the undo
tablespace pages that we are modifying when truncating the tablespace.
It suffices to assert that some pages were modified again:
ut_ad(m_modifications).

This fixes up commit f5fddae3cb
2024-04-25 15:52:38 +03:00
Oleksandr Byelkin
62287320d4 MDEV-33790 Incorrect DEFAULT expression evaluated in UPDATE
The problem was that Item_default_value::associate_with_target_field
assigned passed as argument field as an argument which changed argument
in case of default() call with certain field (i.e. deault(field)).

There is no way to get wrong field in constructor so we will not reassign
parameter.
2024-04-25 14:11:28 +02:00
Marko Mäkelä
77d5104fee Remove a bogus workaround for old GCC
At least starting with ca83115b3e
the source code cannot be compiled with anything older than GCC 4.8.5.

Furthermore, 64-bit atomic read-modify-write operations on IA-32
would depend on the LOCK CMPXCHG8B instruction, which was introduced
in the Intel Pentium. Our IA-32 builds ought to be -march=i686
starting with commit 9cabc9fd8a.

Approved by Sergei Golubchik
2024-04-25 12:58:32 +03:00
Kristian Nielsen
553a4d6271 MDEV-33602: Sporadic test failure in rpl.rpl_gtid_stop_start
The test could fail with a duplicate key error because switching to non-GTID
mode could start at the wrong old-style position. The position could be
wrong when the previous GTID connect was stopped before receiving the fake
GTID list event which gives the old-style position corresponding to the GTID
connected position.

Work-around by injecting an extra event and syncing the slave before
switching to non-GTID mode.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-04-25 11:00:45 +02:00
Marko Mäkelä
a1c1f5029e MDEV-33974 Enable GNU libstdc++ debugging
Starting with GCC 10, let us enable _GLIBCXX_DEBUG as well as
_GLIBCXX_ASSERTIONS which have an impact on the GNU libstdc++.
On GCC 8, we observed a compilation failure related to some
missing type conversion.

Even though clang on GNU/Linux would default to using libstdc++
and enabling the debugging seems to work with clang-18, we will
not enable this on clang, in case it would lead to compilation
errors.

For the clang libc++ before clang-15 there was _LIBCPP_DEBUG,
but according to
llvm/llvm-project@f3966eaf86 and
llvm/llvm-project@13ea134323 and
llvm/llvm-project@ff573a42cd it
looks like that for proper results, a specially built debug version
of libc++ would have to be used in order to enable equivalent checks.

This should help catch bugs like the one that
commit 455a15fd06 fixed.

Reviewed by: Sergei Golubchik
2024-04-25 11:05:03 +03:00
Marko Mäkelä
7229384256 MDEV-23974 fixup: Cover all debug builds
While commit 75b7cd680b was a significant
improvement, we occasionally got test failures of debug builds. One of
the affected tests is innodb.innodb-64k-crash.
2024-04-25 07:48:57 +03:00
Sergei Golubchik
7d5e08de6b MDEV-20157 perfschema.stage_mdl_function failed in buildbot with wrong result
MDL wait consists of short 1 second waits (this is not configurable)
repeated until lock_wait_timeout is reached. The stage is changed
to Waiting and back every second. To have predictable result in the
test the query should filter all sequences of X, "Waiting for MDL", X,
leaving just X.
2024-04-24 18:09:58 +02:00
Sergei Golubchik
259394aed7 disable mariabackup.incremental_encrypted,64k on 32bit
it allocates 1GB of memory, it causes failures in CI
2024-04-24 18:09:20 +02:00
Sergei Golubchik
e2f95ebbcb fix galera_3nodes.galera_gtid_consistency to work with nc
like other galera tests do
2024-04-24 18:09:20 +02:00
Thirunarayanan Balathandayuthapani
0c55d854fe MDEV-33334 mariadb-backup fails to preserve innodb_encrypt_tables
Problem:
========
mariabackup --prepare fails to write the pages in encrypted format.
This issue happens only for default encrypted table when
innodb_encrypt_tables variable is enabled.

Fix:
====
backup process should write the value of innodb_encrypt_tables
variable in configuration file. prepare should enable the
variable based on configuration file.
2024-04-24 16:27:31 +05:30
Meng-Hsiu Chiang
55cb2c2916 MDEV-29955: Set path for zlib library with pkg-config
`FindZLIB` module uses variable `ZLIB_ROOT`[1] to look for libraries. By
setting the variable, `FindZLIB` is able to search the libraries that
installed in a non-system path (/workspace/mylib for example).

And when using `z` in `LINK_LIBRARIES()` CMake tries to lookup the
library in system path by default. It doesn't work if the library isn't
installed in the path, and use ${ZLIB_LIBRARY} which set by FindZLIB
solve the issue.

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services.

[1]: https://cmake.org/cmake/help/latest/module/FindZLIB.html#hints
2024-04-23 15:40:57 +01:00
Sergei Golubchik
e73181112f MDEV-16944 fix galera tests
followup for 061adae9a2
2024-04-23 10:55:35 +02:00
Alexander Barkov
e02077aa03 MDEV-21076 NOT NULL and UNIQUE constraints cause SUM() to yield an incorrect result
This problem was earlier fixed by the patch for MDEV 33344.
Adding a test case only.
2024-04-23 10:45:39 +04:00
Alexander Barkov
24abbb9bdb MDEV-21034 GREATEST() and LEAST() malfunction for NULL
There is a convention that Item::val_int() and Item::val_real() return
SQL NULL doing effectively what this code does:
  null_value= true;
  return 0; // Always return 0 for SQL NULL

This is done to optimize boolean value evaluation:
if Item::val_int() or Item::val_real() returned 1 -
that always means TRUE and never can means SQL NULL.
This convention helps to avoid unnecessary testing
Item::null_value after getting a non-zero return value.

Item_func_min_max did not follow this convention.
It could return a non-zero value together with null_value==true.
This made evaluate_join_record() erroneously misinterpret
SQL NULL as TRUE in this call:

  select_cond_result= MY_TEST(select_cond->val_int());

Fixing Item_func_min_max to follow the convention.
2024-04-23 02:38:26 +04:00
Markus Staab
361b790392 Remove unnecessary whitespace in mysqldump 2024-04-22 21:41:29 +01:00
Jan Lindström
a2fee2da0b MDEV-33928 : Assertion failure on wsrep_thd_is_aborting
Problem was assertion assuming we always hold
THD::LOCK_thd_data mutex that is not true.
In most cases this is true but function is
also used from InnoDB lock manager and
there we can't take THD::LOCK_thd_data to
obey mutex ordering. Removed assertion as
wsrep transaction state can't change even
that case.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-04-22 16:57:48 +02:00
Kristian Nielsen
57f6a1ca98 MDEV-19415: use-after-free on charsets_dir from slave connect
The slave IO thread sets MYSQL_SET_CHARSET_DIR. The code for this option
however is not thread-safe in sql-common/client.c. The value set is
temporarily written to mysys global variable `charsets-dir` and can be seen
by other threads running in parallel, which can result in use-after-free
error.

Problem was visible as random failures of test cases in suite multi_source
with Valgrind or MSAN.

Work-around by not setting this option for slave connect, it is redundant
anyway as it is just setting the default value.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-04-20 13:41:08 +02:00
Kristian Nielsen
0c249ad718 MDEV-30232: rpl.rpl_gtid_crash fails sporadically in BB
The root cause of the failure is a bug in the Linux network stack:

  https://lore.kernel.org/netdev/87sf0ldk41.fsf@urd.knielsen-hq.org/T/#u

If the slave does a connect(2) at the exact same time that kill -9 of the
master process closes the listening socket, the FIN or RST packet is lost in
the kernel, and the slave ends up timing out waiting for the initial
communication from the server. This timeout defaults to
--slave-net-timeout=120, which causes include/master_gtid_wait.inc to time
out first and fail the test.

Work-around this problem by reducing the --slave-net-timeout for this test
case. If this problem turns up in other tests, we can consider reducing the
default value for all tests.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-04-20 13:41:08 +02:00
Sergei Golubchik
4a2e03453a MDEV-33952 galera_create_table_as_select fails sporadically
disable until fixed
2024-04-19 22:09:41 +02:00
Zhibo Zhang
7432a487b1 Update tests to be compatible with OpenSSL 3.2.0
As of version 3.2.0, OpenSSL updated the error message in new versions
("https://github.com/openssl/openssl/commit/81b741f68984"). Update the
tests and result files such that they are compatible with both original
and new error messages.

All new code of the whole pull request, including one or several files that are
either new files or modified ones, are contributed under the BSD-new
license. I am contributing on behalf of my employer Amazon Web Services,
Inc.
2024-04-19 15:44:28 +01:00
Marko Mäkelä
4c34339426 MDEV-33946: OPT_PAGE_CHECKSUM mismatch due to mtr_t::memmove()
mtr_t::memmove(): Revert to the parent of
commit a032f14b34
where there was supposed to be an equivalent change
that would avoid hitting a warning in some old version of GCC
when this change was part of another 10.6 based developmet branch.

For some reason, this change is not equivalent but will cause
massive amounts of backup failures in the stress tests
run by Matthias Leich, caught by
commit 4179f93d28 in 10.6.
2024-04-19 15:46:21 +03:00
Vladislav Vaintroub
2e84560dc4 MDEV-16944 postfix. Fix a typo 2024-04-18 09:45:41 +02:00
mariadb-DebarunBanerjee
5928e04d5f MDEV-32489 Change buffer index fails to delete the records
When the change buffer records for a page span across multiple change
buffer leaf pages or the starting record is at the beginning of a page
with a left sibling, ibuf_delete_recs deletes only the records in first
page and fails to move to subsequent pages.

Subsequently a slow shutdown hangs trying to delete those left over
records.

Fix-A: Position the cursor to an user record in B-tree and exit only
when all records are exhausted.

Fix-B: Make sure we call ibuf_delete_recs during slow shutdown for
pages with IBUF entries to cleanup any previously left over records.
2024-04-18 08:30:21 +05:30
Brandon Nesterenko
0ad52e4d6a MDEV-27512: Assertion !thd->transaction_rollback_request failed in rows_event_stmt_cleanup
If replicating an event in ROW format, and InnoDB detects a deadlock
while searching for a row, the row event will error and rollback in
InnoDB and indicate that the binlog cache also needs to be cleared,
i.e. by marking thd->transaction_rollback_request. In the normal
case, this will trigger an error in Rows_log_event::do_apply_event()
and cause a rollback. During the Rows_log_event::do_apply_event()
cleanup of a successful event application, there is a DBUG_ASSERT in
log_event_server.cc::rows_event_stmt_cleanup(), which sets the
expectation that thd->transaction_rollback_request cannot be set
because the general rollback (i.e. not the InnoDB rollback) should
have happened already. However, if the replica is configured to skip
deadlock errors, the rows event logic will clear the error and
continue on, as if no error happened. This results in
thd->transaction_rollback_request being set while in
rows_event_stmt_cleanup(), thereby triggering the assertion.

This patch fixes this in the following ways:
 1) The assertion is invalid, and thereby removed.
 2) The rollback case is forced in rows_event_stmt_cleanup() if
transaction_rollback_request is set.

Note the differing behavior between transactions which are skipped
due to deadlock errors and other errors. When a transaction is
skipped due to an ignored deadlock error, the entire transaction is
rolled back and skipped (though note MDEV-33930 which allows
statements in the same transaction after the deadlock-inducing one
to commit). When a transaction is skipped due to ignoring a
different error, only the erroring statements are rolled-back and
skipped - the rest of the transaction will execute as normal. The
effect of this can be seen in the test results. The added test case
to rpl_skip_error.test shows that only statements which are ignored
due to non-deadlock errors are ignored in larger transactions. A
diff between rpl_temporary_error2_skip_all.result and
rpl_temporary_error2.result shows that all statements in the errored
transaction are rolled back (diff pasted below):

: diff rpl_temporary_error2.result rpl_temporary_error2_skip_all.result
49c49
< 2	1
---
> 2	NULL
51c51
< 4	1
---
> 4	NULL
53c53
< * There will be two rows in t2 due to the retry.
---
> * There will be one row in t2 because the ignored deadlock does not retry.
57d56
< 1
59c58
< 1
---
> 0

Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
2024-04-17 11:14:21 -06:00
Vladislav Vaintroub
061adae9a2 MDEV-16944 Fix file sharing issues on Windows in mysqltest
On Windows systems, occurrences of ERROR_SHARING_VIOLATION due to
conflicting share modes between processes accessing the same file can
result in CreateFile failures.

mysys' my_open() already incorporates a workaround by implementing
wait/retry logic on Windows.

But this does not help if files are opened using shell redirection like
mysqltest traditionally did it, i.e via

--echo exec "some text" > output_file

In such cases, it is cmd.exe, that opens the output_file, and it
won't do any sharing-violation retries.

This commit addresses the issue by introducing a new built-in command,
'write_line', in mysqltest. This new command serves as a brief alternative
to 'write_file', with a single line output, that also resolves variables
like "exec" would.

Internally, this command will use my_open(), and therefore retry-on-error
logic.

Hopefully this will eliminate the very sporadic "can't open file because
it is used by another process" error on CI.
2024-04-17 16:52:37 +02:00
Vladislav Vaintroub
b48de9737b Remove duplicate key "Language" from .clang-format
Latest Visual Studio complains about invalid format, it breaks formatting
in the IDE
2024-04-17 16:52:37 +02:00
Vladislav Vaintroub
173847b76a Do not run maria_recover_encrypted with embedded.
It uses shutdown/restart etc, features not compatible the embedded.

also add have_debug.inc , since it uses debug_dbug variable
2024-04-17 16:52:17 +02:00
Vladislav Vaintroub
e87a175b3a Fix LTO (aka interprocedural optimization) build with MSVC
Also, disable MSVC LTO for static client libraries - they won't be usable
for end-users.
2024-04-17 16:18:44 +02:00
mariadb-DebarunBanerjee
040069f4ba MDEV-33431 Latching order violation reported fil_system.sys_space.latch and ibuf_pessimistic_insert_mutex
Issue:
------
The actual order of acquisition of the IBUF pessimistic insert mutex
(SYNC_IBUF_PESS_INSERT_MUTEX) and IBUF header page latch
(SYNC_IBUF_HEADER) w.r.t space latch (SYNC_FSP) differs from the order
defined in sync0types.h. It was not discovered earlier as the path to
ibuf_remove_free_page was not covered by the mtr test. Ideal order and
one defined in sync0types.h is as follows.
SYNC_IBUF_HEADER -> SYNC_IBUF_PESS_INSERT_MUTEX -> SYNC_FSP

In ibuf_remove_free_page, we acquire space latch earlier and we have
the order as follows resulting in the assert with innodb_sync_debug=on.
SYNC_FSP -> SYNC_IBUF_HEADER -> SYNC_IBUF_PESS_INSERT_MUTEX

Fix:
---
We do maintain this order in other places and there doesn't seem to be
any real issue here. To reduce impact in GA versions, we avoid doing
extensive changes in mutex ordering to match the current
SYNC_IBUF_PESS_INSERT_MUTEX order. Instead we relax the ordering check
for IBUF pessimistic insert mutex using SYNC_NO_ORDER_CHECK.
2024-04-17 15:16:50 +05:30
Vladislav Vaintroub
f6e9600f42 MDEV-33840 tpool- switch to longer maintainence timer interval, if pool is idle
Previous solution, that would entirely switch timer off, turned out
to be deadlock prone.

This patch fixed previous attempt to switch between long/short interval
periods in MDEV-24295. Now, initial state of the timer is fixed (it is ON).
Also, avoid switching timer to longer periods if there is any activity in
the pool.
2024-04-17 10:49:59 +02:00
Vladislav Vaintroub
2ba79aba2b Revert "MDEV-33840 tpool : switch off maintenance timer when not needed."
This reverts commit 09bae92c16.
2024-04-17 09:58:34 +02:00
Marko Mäkelä
3a3fe3005d Merge 10.4 into 10.5 2024-04-17 10:10:26 +03:00
Marko Mäkelä
9164c2b8bb Tests: remove a duplicated check
This fixes up the merge commit 9b18275623
2024-04-17 10:10:23 +03:00
Jan Lindström
4aeba2590b MDEV-33895 : Galera test failure on galera_sr.MDEV-25718
Test was waiting INSERT-clause to make rollback but
wait_condition was too tight. State could be
Freeing items or Rollback. Fixed wait_condition
to expect one of them.
2024-04-17 09:41:15 +03:00
Sergei Golubchik
41e7ceb0ac MDEV-33889 Read only server throws error when running a create temporary table as select statement
create_partitioning_metadata() should only mark transaction r/w
if it actually did anything (that is, the table is partitioned).

otherwise it's a no-op, called even for temporary tables and
it shouldn't do anything at all
2024-04-16 20:43:31 +02:00
Oleksandr Byelkin
9b18275623 Merge branch '10.4' into 10.5 2024-04-16 11:04:14 +02:00
Oleksandr Byelkin
50998a6c6f MDEV-33861 main.query_cache fails with embedded after enabling WITH_PROTECT_STATEMENT_MEMROOT
Synopsis: If SELECT returned answer from Query Cache it is not really executed.

The reason for firing of assertion
  DBUG_ASSERT((mem_root->flags & ROOT_FLAG_READ_ONLY) == 0);
is that in case the query_cache is on and the same query run by different
stored routines the following use case can take place:
First, lets say that bodies of routines used by the test case are the same
and contains the only query 'SELECT * FROM t1';
  call p1() -- a result set is stored in query cache for further use.
  call p2() -- the same query is run against the table t1, that result in
               not running the actual query but using its cached result.
               On finishing execution of this routine, its memory root is
               marked for read only since every SP instruction that this
               routine contains has been executed.
  INSERT INT t1 VALUE (1); -- force following invalidation of query cache
  call p2() -- query the table t1 will result in assertion failure since its
               execution would require allocation on the memory root that
               has been already marked as read only memory root

The root cause of firing the assertion is that memory root of the stored
routine 'p2' was marked as read only although actual execution of the query
contained inside hadn't been performed.

To fix the issue, mark a SP instruction as not yet run in case its execution
doesn't result in real query processing and a result set got from query cache
instead.

Note that, this issue relates server built in debug mode AND with the protect
statement memory root feature turned on. It doesn't affect server built
in release mode.
2024-04-16 08:52:51 +02:00
Kristian Nielsen
ce104d4171 Fix windows build failure
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-04-15 18:54:30 +02:00
Kristian Nielsen
16aa4b5f59 Merge from 10.4 to 10.5
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-04-15 17:46:49 +02:00
Kristian Nielsen
10272f3709 Distinguish "manager stopped" from "manager not started"
This way, if manager thread somehow starts and stops again quickly before
main thread wakes up to check if it started correctly, we will not hang.

Patch suggested by Monty as follow-up to
7f498fbab8

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-04-15 10:23:46 +02:00
Marko Mäkelä
a032f14b34 MDEV-33559 matched_rec::block should be allocated from the buffer pool
matched_rec::rec_buf[], matched_rec::bufp: Remove.

matched_rec::block: Make this a pointer to something that
is allocated by buf_block_alloc(). In this way, the only
case where buf_block_t is constructed outside buf_pool
is ALTER TABLE...IMPORT TABLESPACE.

rtr_info::heap: Remove. This was only used for allocating matched_rec,
which now is smaller.

mtr_t::memmove(): Simplify some code to avoid GCC 9.4.0 -Wconversion
in the 10.6 branch as a result of these changes.

Reviewed by: Debarun Banerjee
2024-04-15 09:04:11 +03:00
Daniel Black
ea810b04cb MDEV-30676 rpl.parallel_backup* tests sometimes fail
Raise innodb_lock_wait_timeout from 1 to 5
2024-04-15 15:45:03 +10:00