The test waits for the event to get stuck on MASTER_DELAY,
but on a slow/overloaded slave the event might pass MASTER_DELAY
before the test starts waiting.
Instead, wait for the event to get stuck on the LOCK TABLES (after
MASTER_DELAY), which the event cannot avoid.
Extends 89c907bd4f to account for
binlog_two_phase_alter flags in a Gtid log event. I.e., if the
FL_COMMIT_ALTER_E1 or FL_ROLLBACK_ALTER_E2 flags are set in the
event flags, yet the length of the event is too short to hold
the value, then mark the event as invalid.
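A minimal sketch of the added check, assuming the flags are followed
by an 8-byte start-ALTER sequence number and that a Gtid event with
seq_no == 0 is treated as invalid (variable names are illustrative,
not the actual event-parsing code):

  if (flags_extra & (FL_COMMIT_ALTER_E1 | FL_ROLLBACK_ALTER_E2))
  {
    if (static_cast<size_t>(buf_end - buf) < 8)
      return;                 /* too short: seq_no stays 0 => invalid */
    sa_seq_no= uint8korr(buf);  /* sequence number of the start ALTER */
    buf+= 8;
  }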
pmem_phwsync(): The implementation for POWER ISA v3.1 that is
compatible with libpmem.
pmem_fence(): A dummy implementation for older ISA. While such systems
are unlikely to support MAP_SYNC memory mappings, this could be useful
when running tests with memory-mapped /dev/shm/*/ib_logfile0
(the "fake PMEM"), to ensure that mariadb-backup will be able to
read the latest redo log contents.
pmem_persist_init(): Check the availability of POWER ISA v3.1.
Thanks to Daniel Black for suggesting this.
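A rough sketch of the dispatch, assuming the ISA level is detected
via getauxval(AT_HWCAP2); the function names follow the commit, while
the fallback instruction and the surrounding code are illustrative
(the phwsync mnemonic requires a recent binutils):

  #include <sys/auxv.h>
  #ifndef PPC_FEATURE2_ARCH_3_1
  # define PPC_FEATURE2_ARCH_3_1 0x00040000 /* Linux HWCAP2 bit */
  #endif

  /* persistent heavyweight sync, POWER ISA v3.1 */
  static void pmem_phwsync()
  { __asm__ __volatile__("phwsync" ::: "memory"); }

  /* dummy fallback for older ISA ("fake PMEM" testing only) */
  static void pmem_fence()
  { __asm__ __volatile__("sync" ::: "memory"); }

  static void (*pmem_control)()= pmem_fence;

  static void pmem_persist_init()
  {
    if (getauxval(AT_HWCAP2) & PPC_FEATURE2_ARCH_3_1)
      pmem_control= pmem_phwsync;
  }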
I checked all potential stack overflow problems found with
gcc -Wstack-usage=16384
and
clang -Wframe-larger-than=16384 -no-inline
Fixes:
Added '#pragma clang diagnostic ignored "-Wframe-larger-than="'
to a number of functions where the stack usage is large but reasonable.
- Added stack check warnings to BUILD scripts when using clang and debug.
Functions changed to use malloc() instead of allocating things on the
stack (see the sketch below):
- read_bootstrap_query() now allocates line_buffer (20000 bytes) with
malloc() instead of using the stack. This has a small performance impact,
but that is not relevant for bootstrap.
- mroonga grn_select() used 65856 bytes on stack. Changed it to use
malloc().
- Wsrep_schema::replay_transaction() and
Wsrep_schema::recover_sr_transactions().
- Connect zipOpen3()
Not fixed:
- mroonga/vendor/groonga/lib/expr.c grn_proc_call() uses
43712 bytes of stack. However, this is not easy to fix, as the stack
used is caused by a lot of code generated by defines.
- Most changes in mroonga/groonga were only the addition of pragmas to
disable stack warnings.
- rocksdb/options/options_helper.cc uses 20288 bytes of stack space.
(no reason to fix except to get rid of the compiler warning)
- Cases using alloca() where the allocation size is reasonable.
- An issue in libmariadb (reported to connectors).
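The general shape of the malloc() conversions above, using
read_bootstrap_query()'s line_buffer as the example (simplified; the
error value and flags are illustrative):

  /* Before: char line_buffer[20000]; on the stack */
  char *line_buffer= (char*) my_malloc(PSI_NOT_INSTRUMENTED, 20000,
                                       MYF(MY_WME));
  if (!line_buffer)
    return -1;                /* illustrative error handling */
  ...
  my_free(line_buffer);       /* must now be freed on every exit path */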
In case of a partition insert, InnoDB fails to end the bulk insert
for one of the partitions. This leads to a bulk insert operation being
applied for the consecutive delete statement.
trx_t::bulk_insert_apply_for_table(): Irrespective of the bulk insert
value, InnoDB should end the bulk insert for the table.
auth_map.so isn't guaranteed to be available. Fedora packages it
separately.
The --base-dir path of mysql_install_db.sh seems to contain
historical heuristics that have been replaced on other branches
of the script.
We attempt to do the same here, resolving the original basedir paths
so that all components are absolute.
The problem was an assertion assuming that we always hold the
THD::LOCK_thd_data mutex, which is not true. In most cases it is
held, but the function is also used from the InnoDB lock manager,
where we cannot take THD::LOCK_thd_data without violating the mutex
ordering. The assertion was removed, as the wsrep transaction state
cannot change even in that case.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
First stop the slave, then run commands on the master that are
supposed to fail on the slave, then start the slave.
If you swap the first two steps, the slave might receive and execute
those commands before it's stopped, which will fail the test.
Also, improve debuggability.
In $case=2 it's wrong to kill after the first binlog EOF,
because that might happen between INSERT(4) and INSERT(5).
So, wait for the slave to acknowledge INSERT(5) before killing
the master, that is, both connection threads must pass
repl_semisync_master.wait_after_sync()
The slave IO thread sets MYSQL_SET_CHARSET_DIR. The code for this option
however is not thread-safe in sql-common/client.c. The value set is
temporarily written to the mysys global variable `charsets-dir` and can
be seen by other threads running in parallel, which can result in a
use-after-free error.
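Roughly, the connect path applies the per-connection option by
temporarily overwriting a process-wide mysys global (simplified
sketch, not the exact client.c code):

  const char *save= charsets_dir;
  if (mysql->options.charset_dir)
    charsets_dir= mysql->options.charset_dir; /* seen by all threads */
  mysql->charset= get_charset_by_csname(mysql->options.charset_name,
                                        MY_CS_PRIMARY, MYF(MY_WME));
  charsets_dir= save;  /* a parallel connect may still hold the old
                          pointer, hence the use-after-free */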
Problem was visible as random failures of test cases in suite multi_source
with Valgrind or MSAN.
Work around this by not setting the option for the slave connection; it
is redundant anyway, as it merely sets the default value.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
The root cause of the failure is a bug in the Linux network stack:
https://lore.kernel.org/netdev/87sf0ldk41.fsf@urd.knielsen-hq.org/T/#u
If the slave does a connect(2) at the exact same time that kill -9 of the
master process closes the listening socket, the FIN or RST packet is lost in
the kernel, and the slave ends up timing out waiting for the initial
communication from the server. This timeout defaults to
--slave-net-timeout=120, which causes include/master_gtid_wait.inc to time
out first and fail the test.
Work around this problem by reducing the --slave-net-timeout for this test
case. If this problem turns up in other tests, we can consider reducing the
default value for all tests.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
As of version 3.2.0, OpenSSL updated the error message
(https://github.com/openssl/openssl/commit/81b741f68984). Update the
tests and result files such that they are compatible with both original
and new error messages.
All new code of the whole pull request, including one or several files that are
either new files or modified ones, are contributed under the BSD-new
license. I am contributing on behalf of my employer Amazon Web Services,
Inc.
mtr_t::memmove(): Revert to the parent of
commit a032f14b34
where there was supposed to be an equivalent change
that would avoid hitting a warning in some old version of GCC
when this change was part of another 10.6-based development branch.
For some reason, this change is not equivalent but causes
massive numbers of backup failures in the stress tests
run by Matthias Leich, caught by
commit 4179f93d28 in 10.6.
ibuf_remove_free_page(): Correct the calculation of root_savepoint().
The first entry acquired by ibuf_tree_root_get() will be ibuf.index.lock
and not the change buffer root page.
Thanks to Matthias Leich for finding this bug in RQG.
Unfortunately, this code is very difficult to cover
in our regression test suite.
The libpmem dependency that had been added in
commit 3daef523af (MDEV-17084)
did not achieve any measurable performance improvement when
comparing the same PMEM device with and without "mount -o dax"
using the Linux ext4 file system.
Because Red Hat has deprecated libpmem, let us remove the code
altogether.
Note: This is a 10.6 version of
commit 3f9f5ca48e
which will retain PMEM support in MariaDB Server 10.11.
Because the Red Hat Enterprise Linux 8 core repository does not include
libpmem, let us implement the necessary subset ourselves.
pmem_persist(): Implement for 64-bit x86, ARM, POWER, RISC-V, Loongarch
in a way that should be compatible with the https://github.com/pmem/pmdk/
implementation of pmem_persist().
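For example, on 64-bit x86 the flush-then-fence sequence is roughly
the following (a sketch assuming CLFLUSHOPT is available and cache
lines are 64 bytes; the real code also has to handle CLWB and plain
CLFLUSH):

  #include <stddef.h>
  #include <stdint.h>
  #include <immintrin.h>  /* _mm_clflushopt(), _mm_sfence() */

  static void pmem_persist(const void *buf, size_t size)
  {
    uintptr_t start= uintptr_t(buf) & ~uintptr_t{63};
    uintptr_t end= uintptr_t(buf) + size;
    for (uintptr_t p= start; p < end; p+= 64)
      _mm_clflushopt((void*) p);  /* flush each covered cache line */
    _mm_sfence();                 /* order flushes before later stores */
  }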
The CMake option WITH_INNODB_PMEM can be used for enabling or disabling
this interface at compile time. By default, it is enabled on all applicable
systems that are covered by our CI system.
Note: libpmem had not been previously enabled for Loongarch in our
Debian packaging. It was enabled for RISC-V, but we will not enable it
by default on RISC-V or Loongarch because we lack CI coverage.
The generated code for x86_64 was reviewed and tested on two
Intel implementations: one that only supports clflush, and
another that supports both clflushopt and clwb.
The generated machine code was also reviewed on https://godbolt.org
using various compiler versions. Godbolt helpfully includes an option
to compile to binary code and display the encoding, which was
useful on POWER.
Reviewed by: Vladislav Vaintroub
bulk_insert_apply_for_table(dict_table_t*)
This issue is caused by
commit 188c5da72a (MDEV-32453).
trx_t::bulk_insert_apply_for_table(): Remove the assertion on
check_unique_secondary and check_foreigns. InnoDB can
apply the bulk insert operation even after the
check_foreigns and check_unique_secondary variables have been disabled.
When the change buffer records for a page span across multiple change
buffer leaf pages or the starting record is at the beginning of a page
with a left sibling, ibuf_delete_recs deletes only the records in the
first page and fails to move to subsequent pages.
Subsequently a slow shutdown hangs trying to delete those left over
records.
Fix-A: Position the cursor on a user record in the B-tree and exit only
when all records are exhausted (see the sketch after this list).
Fix-B: Make sure we call ibuf_delete_recs during slow shutdown for
pages with IBUF entries, to clean up any previously left-over records.
- When `libcurl` is installed in a path outside the default ones, as on
Windows, `include_directories` failed to find `curl/curl.h`.
- Fix the `cmake` configuration by using modern syntax with an imported
target and `find_package`
- Fix warnings treated as errors
- Remove `HASHICORP_HAVE_EXCEPTIONS` macro and related code
- Add package to `Server` component in Windows
- Tested with `$ ./mysql-test/mtr --suite=vault`
- Closes PR #3068
- Reviewer: <wlad@mariadb.com>
<julius.goryavsky@mariadb.com>
If replicating an event in ROW format, and InnoDB detects a deadlock
while searching for a row, the row event will error and rollback in
InnoDB and indicate that the binlog cache also needs to be cleared,
i.e. by marking thd->transaction_rollback_request. In the normal
case, this will trigger an error in Rows_log_event::do_apply_event()
and cause a rollback. During the Rows_log_event::do_apply_event()
cleanup of a successful event application, there is a DBUG_ASSERT in
log_event_server.cc::rows_event_stmt_cleanup(), which sets the
expectation that thd->transaction_rollback_request cannot be set
because the general rollback (i.e. not the InnoDB rollback) should
have happened already. However, if the replica is configured to skip
deadlock errors, the rows event logic will clear the error and
continue on, as if no error happened. This results in
thd->transaction_rollback_request being set while in
rows_event_stmt_cleanup(), thereby triggering the assertion.
This patch fixes this in the following ways:
1) The assertion is invalid and is thereby removed.
2) The rollback case is forced in rows_event_stmt_cleanup() if
transaction_rollback_request is set (sketched below).
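A sketch of change (2), assuming the cleanup can reuse the standard
rollback calls (simplified; error handling omitted):

  /* in rows_event_stmt_cleanup() */
  if (unlikely(thd->transaction_rollback_request))
  {
    trans_rollback_stmt(thd);  /* InnoDB has already rolled back */
    trans_rollback(thd);       /* roll back the entire transaction */
  }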
Note the differing behavior between transactions which are skipped
due to deadlock errors and other errors. When a transaction is
skipped due to an ignored deadlock error, the entire transaction is
rolled back and skipped (though note MDEV-33930 which allows
statements in the same transaction after the deadlock-inducing one
to commit). When a transaction is skipped due to ignoring a
different error, only the erroring statements are rolled-back and
skipped - the rest of the transaction will execute as normal. The
effect of this can be seen in the test results. The added test case
to rpl_skip_error.test shows that only statements which are ignored
due to non-deadlock errors are ignored in larger transactions. A
diff between rpl_temporary_error2_skip_all.result and
rpl_temporary_error2.result shows that all statements in the errored
transaction are rolled back (diff pasted below):
: diff rpl_temporary_error2.result rpl_temporary_error2_skip_all.result
49c49
< 2 1
---
> 2 NULL
51c51
< 4 1
---
> 4 NULL
53c53
< * There will be two rows in t2 due to the retry.
---
> * There will be one row in t2 because the ignored deadlock does not retry.
57d56
< 1
59c58
< 1
---
> 0
Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
On Windows systems, occurrences of ERROR_SHARING_VIOLATION due to
conflicting share modes between processes accessing the same file can
result in CreateFile failures.
mysys' my_open() already incorporates a workaround by implementing
wait/retry logic on Windows.
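That workaround is essentially a retry loop around CreateFile (a
simplified sketch, not the exact mysys code):

  HANDLE h= INVALID_HANDLE_VALUE;
  for (int attempt= 0; attempt < 10; attempt++)
  {
    h= CreateFile(name, access, share_mode, NULL, OPEN_EXISTING,
                  0, NULL);
    if (h != INVALID_HANDLE_VALUE ||
        GetLastError() != ERROR_SHARING_VIOLATION)
      break;
    Sleep(50);  /* conflicting share mode held by another process */
  }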
But this does not help if files are opened using shell redirection like
mysqltest traditionally did it, i.e. via
--echo exec "some text" > output_file
In such cases, it is cmd.exe that opens the output_file, and it
won't do any sharing-violation retries.
This commit addresses the issue by introducing a new built-in command,
'write_line', in mysqltest. This new command serves as a brief
alternative to 'write_file' with single-line output, and it also
resolves variables the way 'exec' would.
Internally, this command uses my_open(), and therefore the
retry-on-error logic.
Hopefully this will eliminate the very sporadic "can't open file because
it is used by another process" error on CI.