There is a server startup option --gdb a.k.a. --debug-gdb that requests
signals to be set for more convenient debugging. Most notably, SIGINT
(ctrl-c) will not be ignored, and you will be able to interrupt the
execution of the server while GDB is attached to it.
When we are debugging, the signal handlers that would normally display
a terse stack trace are useless.
When we are debugging with rr, the signal handlers may interfere with
a SIGKILL that could be sent to the process by the environment, and ruin
the rr replay trace, due to a Linux kernel bug
https://lkml.org/lkml/2021/10/31/311
To be able to diagnose bugs in kill+restart tests, we may really need
both a trace before the SIGKILL and a trace of the failure after a
subsequent server startup. So, we had better avoid hitting the problem
by simply not installing those signal handlers.
The crash happened because my_isalnum() does not support character
sets with mbminlen>1.
The value of "ft_boolean_syntax" is converted to utf8 in do_string_check().
So calling my_isalnum() is combination with "default_charset_info" was wrong.
Adding new parameters (size_t length, CHARSET_INFO *cs) to
ft_boolean_check_syntax_string() and passing self->charset(thd)
as the character set.
This patch is the plan D variant for fixing potetial mutex locking
order exercised by BF aborting and KILL command execution.
In this approach, KILL command is replicated as TOI operation.
This guarantees total isolation for the KILL command execution
in the first node: there is no concurrent replication applying
and no concurrent DDL executing. Therefore there is no risk of
BF aborting to happen in parallel with KILL command execution
either. Potential mutex deadlocks between the different mutex
access paths with KILL command execution and BF aborting cannot
therefore happen.
TOI replication is used, in this approach, purely as means
to provide isolated KILL command execution in the first node.
KILL command should not (and must not) be applied in secondary
nodes. In this patch, we make this sure by skipping KILL
execution in secondary nodes, in applying phase, where we
bail out if applier thread is trying to execute KILL command.
This is effective, but skipping the applying of KILL command
could happen much earlier as well.
This patch also fixes mutex locking order and unprotected
THD member accesses on bf aborting case. We try to hold
THD::LOCK_thd_data during bf aborting. Only case where it
is not possible is at wsrep_abort_transaction before
call wsrep_innobase_kill_one_trx where we take InnoDB
mutexes first and then THD::LOCK_thd_data.
This will also fix possible race condition during
close_connection and while wsrep is disconnecting
connections.
Added wsrep_bf_kill_debug test case
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
before change test:
strace -fe trace=file -o /tmp/f.strace sql/mysqld --datadir=/tmp/d --log-bin=foo-bin --help --verbose && ls -la /tmp/
...
'mysqladmin variables' instead of 'mysqld --verbose --help'.
total 0
drwxrwxr-x. 2 dan dan 60 May 19 18:05 .
drwxrwxrwt. 27 root root 640 May 19 18:03 ..
-rw-rw----. 1 dan dan 0 May 19 18:05 foo-bin.index
Problem is that not all plugins are loaded when wsrep_recover is executed.
Thus, we allow unknown system variables and extra system variables during
wsrep_recover. Any unknown system variables would still be caught when
the server starts up normally after the SST.
Passing a null pointer to the "%s" argument of a printf-like
function is undefined behaviour. In the GNU libc implementation
of the printf() family of functions, it happens to work.
GCC 10.2.0 would diagnose this with -Wformat-overflow -Og.
In -fsanitize=undefined (WITH_UBSAN=ON) builds, a runtime error
would be generated. In some other builds, GCC 8 or later might infer
that the parameter is nonnull and optimize away further checks whether
the parameter is null, leading to SIGSEGV.
Server auto-sets lower_case_file_system value based on default
datadir's behavior instead of instead of using the directory specified
by the user through the configuration file or command line options.
This patch fixes this problem.
Server auto-sets lower_case_file_system value based on default
datadir's behavior instead of instead of using the directory specified
by the user through the configuration file or command line options.
This patch fixes this problem.
MemorySanitizer (clang -fsanitize=memory) requires that all code
be compiled with instrumentation enabled. The only exception is the
C runtime library. Failure to use instrumented libraries will cause
bogus messages about memory being uninitialized.
In WITH_MSAN builds, we must avoid calling getservbyname(),
because even though it is a standard library function, it is
not instrumented, not even in clang 10.
Note: Before MariaDB Server 10.5, ./mtr will typically fail
due to the old PCRE library, which was updated in MDEV-14024.
The following cmake options were tested on 10.5
in commit 94d0bb4dbe:
cmake \
-DCMAKE_C_FLAGS='-march=native -O2' \
-DCMAKE_CXX_FLAGS='-stdlib=libc++ -march=native -O2' \
-DWITH_EMBEDDED_SERVER=OFF -DWITH_UNIT_TESTS=OFF -DCMAKE_BUILD_TYPE=Debug \
-DWITH_INNODB_{BZIP2,LZ4,LZMA,LZO,SNAPPY}=OFF \
-DPLUGIN_{ARCHIVE,TOKUDB,MROONGA,OQGRAPH,ROCKSDB,CONNECT,SPIDER}=NO \
-DWITH_SAFEMALLOC=OFF \
-DWITH_{ZLIB,SSL,PCRE}=bundled \
-DHAVE_LIBAIO_H=0 \
-DWITH_MSAN=ON
MEM_MAKE_DEFINED(): An alias for VALGRIND_MAKE_MEM_DEFINED()
and __msan_unpoison().
MEM_GET_VBITS(), MEM_SET_VBITS(): Aliases for
VALGRIND_GET_VBITS(), VALGRIND_SET_VBITS(), __msan_copy_shadow().
InnoDB: Replace the UNIV_MEM_ macros with corresponding MEM_ macros.
ut_crc32_8_hw(), ut_crc32_64_low_hw(): Use the compiler built-in
functions instead of inline assembler when building WITH_MSAN.
This will require at least -msse4.2 when building for IA-32 or AMD64.
The inline assembler would not be instrumented, and would thus cause
bogus failures.
This reverts commit 6f1f911497.
because it doesn't do anything now (the server doesn't check
my_disable_leak_check) and it never did anything before
(because without `extern` it simply created a local instance of
my_disable_leak_check, did not affect server's my_disable_leak_check).
This is a backport of the applicable part of
commit 93475aff8d and
commit 2c39f69d34
from 10.4.
Before 10.4 and Galera 4, WSREP_ON is a macro that points to
a global Boolean variable, so it is not that expensive to
evaluate, but we will add an unlikely() hint around it.
WSREP_ON_NEW: Remove. This macro was introduced in
commit c863159c32
when reverting WSREP_ON to its previous definition.
We replace some use of WSREP_ON with WSREP(thd), like it was done
in 93475aff8d. Note: the macro
WSREP() in 10.1 is equivalent to WSREP_NNULL() in 10.4.
Item_func_rand::seed_random(): Avoid invoking current_thd
when WSREP is not enabled.
Since commit 7198c6ab2d
the ./mtr --embedded tests would fail to start innodb_plugin
because of an undefined reference to the symbol wsrep_log().
Let us define a stub for that function. The embedded server
is never built WITH_WSREP, but there are no separate storage
engine builds for the embedded server. Hence, by default,
the dynamic InnoDB storage engine plugin would be built WITH_WSREP
and it would fail to load into the embedded server library due to
a reference to the undefined symbol.
Part of:
MDEV-21056 Assertion `global_status_var.global_memory_used == 0'
failed upon shutdown after query with DEFAULT on a geometry
field
Fixed by changing the ASSERT for memory leaks to a printf() on
stderr. This has needed as all mutex in mysys has been deleted and we
can't call functions like my_open() anymore.
Also added printing of leaks if safemalloc is used (like we do in 10.5)
Proper C-style type erasure is done via void*, not via char* or something else.
free_key_cache()
free_rpl_filter(): types were fixed to avoid function pointer type cast which
is still undefined behavior.
Note, that casting from void* to any other pointer type is safe and correct.
For release builds, do not declare unused variables.
unpack_row(): Omit a debug-only variable from WSREP diagnostic message.
create_wsrep_THD(): Fix -Wmaybe-uninitialized for the PSI_thread_key.
This allows one to run the test suite even if any of the following
options are changed:
- character-set-server
- collation-server
- join-cache-level
- log-basename
- max-allowed-packet
- optimizer-switch
- query-cache-size and query-cache-type
- skip-name-resolve
- table-definition-cache
- table-open-cache
- Some innodb options
etc
Changes:
- Don't print out the value of system variables as one can't depend on
them to being constants.
- Don't set global variables to 'default' as the default may not
be the same as the test was started with if there was an additional
option file. Instead save original value and reset it at end of test.
- Test that depends on the latin1 character set should include
default_charset.inc or set the character set to latin1
- Test that depends on the original optimizer switch, should include
default_optimizer_switch.inc
- Test that depends on the value of a specific system variable should
set it in the test (like optimizer_use_condition_selectivity)
- Split subselect3.test into subselect3.test and subselect3.inc to
make it easier to set and reset system variables.
- Added .opt files for test that required specfic options that could
be changed by external configuration files.
- Fixed result files in rockdsb & tokudb that had not been updated for
a while.
Problem was that tests select INFORMATION_SCHEMA.PROCESSLIST processes
from user system user and empty state. Thus, there is not clear
state for slave threads.
Changes:
- Added new status variables that store current amount of applier threads
(wsrep_applier_thread_count) and rollbacker threads
(wsrep_rollbacker_thread_count). This will make clear how many slave threads
of certain type there is.
- Added THD state "wsrep applier idle" when applier slave thread is
waiting for work. This makes finding slave/applier threads easier.
- Added force-restart option for mtr to always restart servers between tests
to avoid race on start of the test
- Added wait_condition_with_debug to wait until the passed statement returns
true, or the operation times out. If operation times out, the additional error
statement will be executed
Changes to be committed:
new file: mysql-test/include/force_restart.inc
new file: mysql-test/include/wait_condition_with_debug.inc
modified: mysql-test/mysql-test-run.pl
modified: mysql-test/suite/galera/disabled.def
modified: mysql-test/suite/galera/r/MW-336.result
modified: mysql-test/suite/galera/r/galera_kill_applier.result
modified: mysql-test/suite/galera/r/galera_var_slave_threads.result
new file: mysql-test/suite/galera/t/MW-336.cnf
modified: mysql-test/suite/galera/t/MW-336.test
modified: mysql-test/suite/galera/t/galera_kill_applier.test
modified: mysql-test/suite/galera/t/galera_parallel_autoinc_largetrx.test
modified: mysql-test/suite/galera/t/galera_parallel_autoinc_manytrx.test
modified: mysql-test/suite/galera/t/galera_var_slave_threads.test
modified: mysql-test/suite/wsrep/disabled.def
modified: mysql-test/suite/wsrep/r/variables.result
modified: mysql-test/suite/wsrep/t/variables.test
modified: sql/mysqld.cc
modified: sql/wsrep_mysqld.cc
modified: sql/wsrep_mysqld.h
modified: sql/wsrep_thd.cc
modified: sql/wsrep_var.cc
- --default-character-set can now be disabled in mysqldump
- --skip-resolve can be be disabled in mysqld
- mysql_client_test now resets global variables it changes
- mtr couldn't handle [mysqldump] in config files (wrong regexp used)