The problem was that when using clang + asan, we do not get a correct value
for the thread stack as some local variables are not allocated at the
normal stack.
It looks like that for example clang 18.1.3, when compiling with
-O2 -fsanitize=addressan it puts local variables and things allocated by
alloca() in other areas than on the stack.
The following code shows the issue
Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection
(connect=0x5080000027b8,
put_in_cache=<optimized out>) at sql/sql_connect.cc:1399
THD *thd;
1399 thd->thread_stack= (char*) &thd;
(gdb) p &thd
(THD **) 0x7fffedee7060
(gdb) p $sp
(void *) 0x7fffef4e7bc0
The address of thd is 24M away from the stack pointer
(gdb) info reg
...
rsp 0x7fffef4e7bc0 0x7fffef4e7bc0
...
r13 0x7fffedee7060 140737185214560
r13 is pointing to the address of the thd. Probably some kind of
"local stack" used by the sanitizer
I have verified this with gdb on a recursive call that calls alloca()
in a loop. In this case all objects was stored in a local heap,
not on the stack.
To solve this issue in a portable way, I have added two functions:
my_get_stack_pointer() returns the address of the current stack pointer.
The code is using asm instructions for intel 32/64 bit, powerpc,
arm 32/64 bit and sparc 32/64 bit.
Supported compilers are gcc, clang and MSVC.
For MSVC 64 bit we are using _AddressOfReturnAddress()
As a fallback for other compilers/arch we use the address of a local
variable.
my_get_stack_bounds() that will return the address of the base stack
and stack size using pthread_attr_getstack() or NtCurrentTed() with
fallback to using the address of a local variable and user provided
stack size.
Server changes are:
- Moving setting of thread_stack to THD::store_globals() using
my_get_stack_bounds().
- Removing setting of thd->thread_stack, except in functions that
allocates a lot on the stack before calling store_globals(). When
using estimates for stack start, we reduce stack_size with
MY_STACK_SAFE_MARGIN (8192) to take into account the stack used
before calling store_globals().
I also added a unittest, stack_allocation-t, to verify the new code.
Reviewed-by: Sergei Golubchik <serg@mariadb.org>
This commit fixes GTID inconsistency which was injected by mariabackup SST.
Donor node now writes new info file: donor_galera_info, which is streamed
along the mariabackup donation to the joiner node. The donor_galera_info
file contains both GTID and gtid domain_id, and joiner will use these to
initialize the GTID state.
Commit has new mtr test case: galera_3nodes.galera_gtid_consistency, which
exercises potentially harmful mariabackup SST scenarios. The test has also
scenario with IST joining.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
This commit checks the validity of value change of wsrep_sst_method variable.
The validity check is same as happens in donor node when incoming SST request
is parsed.
The commit has also a mtr test: wsrep.wsrep_variables_sst_method which verifies
that wsrep_sst_method can be succesfully changed to acceptable values and that
the SET command results in error if invalid value was entered.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Do not allow setting wsrep_sst_donor as NULL as it is
incorrect value. User can use value '' (default) that represents
same as NULL. Setting wsrep_cluster_address to NULL is
already handled correctly.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
Updated wsrep-lib to version in which server_state
wait_until_state() and sst_received() were changed to report
errors via return codes instead of throwing exceptions. Added
error handling accordingly.
Tested manually that failure in sst_received() which was
caused by server misconfiguration (unknown configuration variable
in server configuration) does not cause crash due to uncaught
exception.
There are separate flags DBUG_OFF for disabling the DBUG facility
and ENABLED_DEBUG_SYNC for enabling the DEBUG_SYNC facility.
Let us allow debug builds without DEBUG_SYNC.
Note: For CMAKE_BUILD_TYPE=Debug, CMakeLists.txt will continue to
define ENABLED_DEBUG_SYNC.
Making changes to wsrep_mysqld.h causes large parts of server code to
be recompiled. The reason is that wsrep_mysqld.h is included by
sql_class.h, even tough very little of wsrep_mysqld.h is needed in
sql_class.h. This commit introduces a new header file, wsrep_on.h,
which is meant to be included from sql_class.h, and contains only
macros and variable declarations used to determine whether wsrep is
enabled.
Also, header wsrep.h should only contain definitions that are also
used outside of sql/. Therefore, move WSREP_TO_ISOLATION* and
WSREP_SYNC_WAIT macros to wsrep_mysqld.h.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
This commit fixes problems with parsing ipv6 addresses given via
the wsrep_sst_receive_address and wsrep_node_address options.
Also, this commit removes extra lines in the configuration files
in the mtr test suites for Galera related to these parameters.
We will remove the parameter innodb_disallow_writes because it is badly
designed and implemented. The parameter was never allowed at startup.
It was only internally used by Galera snapshot transfer.
If a user executed
SET GLOBAL innodb_disallow_writes=ON;
the server could hang even on subsequent read operations.
During Galera snapshot transfer, we will block writes
to implement an rsync friendly snapshot, as follows:
sst_flush_tables() will acquire a global lock by executing
FLUSH TABLES WITH READ LOCK, which will block any writes
at the high level.
sst_disable_innodb_writes(), invoked via ha_disable_internal_writes(true),
will suspend or disable InnoDB background tasks or threads that could
initiate writes. As part of this, log_make_checkpoint() will be invoked
to ensure that anything in the InnoDB buf_pool.flush_list will be written
to the data files. This has the nice side effect that the Galera joiner
will avoid crash recovery.
The changes to sql/wsrep.cc and to the tests are based on a prototype
that was developed by Jan Lindström.
Reviewed by: Jan Lindström
This commit adds a mtr test for reproducing a test scenario where despite of
innodb_disallow_writes blocking, writes to file system can still happen.
The test launches a garbd node, which triggers one of the cluster node to switch to
SST donor state. In this state, all disk activity should be halted, and e.g.
innodb_disallow_writes has been set. The test records md5sum aggregate over mariadb
data directory when the node enters the donor state, and records another md5sum
when the node leaves the donor state. If there is no IO activity in data directory, these
hashes should be equal.
For this test, the Donor state processing, has beeen instrumented so that, SST donor thread can be
stopped when entering the donor state. The test uses this new dbug sync point,
to control when to record the md5sums.
New SST script was added: wsrep_sst_backup, and garbd uses backup method to lauch the donor
node to call this script, and to enter in donor state.
The backup script could be later extended as general purpose backup method for the cluster.
This commit fixes also one race condition happening in wsrep_sst_rsync, like this:
* wsrep_rsync_sst script requests for flush tables,
and then waits in a loop until mariadbd has created file tables_flushed,
as confirmation that FLUSH TABLES has completed
* mariadbd's SST donor thread, wakes for the flush table request and then performs FTWRL,
and after this it creates the tables_flushed file
* note that SST script will now continue to startup rsync sending
* mariadbd's SST donor thread now calls for sst_disallow_writes(),
so that innodb would setup disk IO blockage, however rsyncing may already be ongoing at this point
This race condition is fixed in this commit, by performing all disk IO blocking before
creating the tables_flushed file.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
wsrep_sst_common did not correctly set name for binlog index
file if custom binlog name was used and this name was
not added to script command line.
Added test case for both log_basename and log_binlog.
wsrep_sst_common did not correctly set name for binlog index
file if custom binlog name was used and this name was
not added to script command line.
Added test case for both log_basename and log_binlog.
This fixed the MySQL bug# 20338 about misuse of double underscore
prefix __WIN__, which was old MySQL's idea of identifying Windows
Replace it by _WIN32 standard symbol for targeting Windows OS
(both 32 and 64 bit)
Not that connect storage engine is not fixed in this patch (must be
fixed in "upstream" branch)