Commit graph

222 commits

Author SHA1 Message Date
Monty
bddbef3573 MDEV-34533 asan error about stack overflow when writing record in Aria
The problem was that when using clang + asan, we do not get a correct value
for the thread stack as some local variables are not allocated at the
normal stack.

It looks like that for example clang 18.1.3, when compiling with
-O2 -fsanitize=addressan it puts local variables and things allocated by
alloca() in other areas than on the stack.

The following code shows the issue

Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection
    (connect=0x5080000027b8,
    put_in_cache=<optimized out>) at sql/sql_connect.cc:1399

THD *thd;
1399      thd->thread_stack= (char*) &thd;
(gdb) p &thd
(THD **) 0x7fffedee7060
(gdb) p $sp
(void *) 0x7fffef4e7bc0

The address of thd is 24M away from the stack pointer

(gdb) info reg
...
rsp            0x7fffef4e7bc0      0x7fffef4e7bc0
...
r13            0x7fffedee7060      140737185214560

r13 is pointing to the address of the thd. Probably some kind of
"local stack" used by the sanitizer

I have verified this with gdb on a recursive call that calls alloca()
in a loop. In this case all objects was stored in a local heap,
not on the stack.

To solve this issue in a portable way, I have added two functions:

my_get_stack_pointer() returns the address of the current stack pointer.
The code is using asm instructions for intel 32/64 bit, powerpc,
arm 32/64 bit and sparc 32/64 bit.
Supported compilers are gcc, clang and MSVC.
For MSVC 64 bit we are using _AddressOfReturnAddress()

As a fallback for other compilers/arch we use the address of a local
variable.

my_get_stack_bounds() that will return the address of the base stack
and stack size using pthread_attr_getstack() or NtCurrentTed() with
fallback to using the address of a local variable and user provided
stack size.

Server changes are:

- Moving setting of thread_stack to THD::store_globals() using
  my_get_stack_bounds().
- Removing setting of thd->thread_stack, except in functions that
  allocates a lot on the stack before calling store_globals().  When
  using estimates for stack start, we reduce stack_size with
  MY_STACK_SAFE_MARGIN (8192) to take into account the stack used
  before calling store_globals().

I also added a unittest, stack_allocation-t, to verify the new code.

Reviewed-by: Sergei Golubchik <serg@mariadb.org>
2024-10-16 17:24:46 +03:00
Sergei Golubchik
8fd1b060f8 reformat galera sst error messages
put the command line at the end. so that when a very long command line
is truncated, it doesn't take the actual error message with it
2024-09-24 14:30:24 +02:00
Julius Goryavsky
45be538cf4 galera SST scripts: added missing 'datadir' parameter for mysqldump method 2024-09-15 06:47:35 +02:00
Alexey Yurchenko
7119149f83 If donor loop receives unknown signal from the SST script it is an
error condition (SST failure), so it should set error code before
exiting.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-01 02:54:05 +02:00
sjaakola
c89f769f24 MDEV-31905 GTID inconsistency
This commit fixes GTID inconsistency which was injected by mariabackup SST.
Donor node now writes new info file: donor_galera_info, which is streamed
along the mariabackup donation to the joiner node. The donor_galera_info
file contains both GTID and gtid domain_id, and joiner will use these to
initialize the GTID state.

Commit has new mtr test case: galera_3nodes.galera_gtid_consistency, which
exercises potentially harmful mariabackup SST scenarios. The test has also
scenario with IST joining.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2023-12-22 00:10:23 +01:00
Oleksandr Byelkin
6cfd2ba397 Merge branch '10.4' into 10.5 2023-11-08 12:59:00 +01:00
sjaakola
c7feacb0de 10.4-MDEV-31470 wsrep_sst_method variable validity checking
This commit checks the validity of value change of wsrep_sst_method variable.
The validity check is same as happens in donor node when incoming SST request
is parsed.

The commit has also a mtr test: wsrep.wsrep_variables_sst_method which verifies
that wsrep_sst_method can be succesfully changed to acceptable values and that
the SET command results in error if invalid value was entered.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2023-10-24 05:14:32 +02:00
Oleksandr Byelkin
f52954ef42 Merge commit '10.4' into 10.5 2023-07-20 11:54:52 +02:00
Jan Lindström
f102b595e8 MDEV-28433 : Server crashes when wsrep_sst_donor and wsrep_cluster_address set to NULL
Do not allow setting wsrep_sst_donor as NULL as it is
incorrect value. User can use value '' (default) that represents
same as NULL. Setting wsrep_cluster_address to NULL is
already handled correctly.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2023-05-12 02:48:16 +02:00
Oleksandr Byelkin
7fa02f5c0b Merge branch '10.4' into 10.5 2023-01-27 13:54:14 +01:00
Teemu Ollakka
beb1e230dd MDEV-30419 Fix unhandled exception thrown from wsrep-lib
Updated wsrep-lib to version in which server_state
wait_until_state() and sst_received() were changed to report
errors via return codes instead of throwing exceptions. Added
error handling accordingly.

Tested manually that failure in sst_received() which was
caused by server misconfiguration (unknown configuration variable
in server configuration) does not cause crash due to uncaught
exception.
2023-01-19 14:55:50 +02:00
Marko Mäkelä
6286a05d80 Merge 10.4 into 10.5 2022-09-26 13:34:38 +03:00
Marko Mäkelä
3c92050d1c Fix build without either ENABLED_DEBUG_SYNC or DBUG_OFF
There are separate flags DBUG_OFF for disabling the DBUG facility
and ENABLED_DEBUG_SYNC for enabling the DEBUG_SYNC facility.
Let us allow debug builds without DEBUG_SYNC.

Note: For CMAKE_BUILD_TYPE=Debug, CMakeLists.txt will continue to
define ENABLED_DEBUG_SYNC.
2022-09-23 17:37:52 +03:00
Jan Lindström
ba987a46c9 Merge 10.4 into 10.5 2022-09-05 13:28:56 +03:00
Daniele Sciascia
2917bd0d2c Reduce compilation dependencies on wsrep_mysqld.h
Making changes to wsrep_mysqld.h causes large parts of server code to
be recompiled. The reason is that wsrep_mysqld.h is included by
sql_class.h, even tough very little of wsrep_mysqld.h is needed in
sql_class.h. This commit introduces a new header file, wsrep_on.h,
which is meant to be included from sql_class.h, and contains only
macros and variable declarations used to determine whether wsrep is
enabled.
Also, header wsrep.h should only contain definitions that are also
used outside of sql/. Therefore, move WSREP_TO_ISOLATION* and
WSREP_SYNC_WAIT macros to wsrep_mysqld.h.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-08-31 11:05:23 +03:00
Marko Mäkelä
ea847cbeaf Merge 10.4 into 10.5 2022-06-27 10:51:20 +03:00
Marko Mäkelä
01d757036f Merge 10.3 into 10.4 2022-06-27 10:14:37 +03:00
Julius Goryavsky
124326d810 MDEV-28656: Inability to roll upgrade without stopping the Galera cluster 2022-06-14 12:29:14 +02:00
Marko Mäkelä
620c55e708 Merge 10.4 into 10.5 2022-04-21 15:33:50 +03:00
Marko Mäkelä
394784095e Merge 10.3 into 10.4 2022-04-21 11:33:59 +03:00
Julius Goryavsky
42908dc5fb MDEV-26171: wsrep_sst_receive_address does not parse IPv6 address correctly
This commit fixes problems with parsing ipv6 addresses given via
the wsrep_sst_receive_address and wsrep_node_address options.

Also, this commit removes extra lines in the configuration files
in the mtr test suites for Galera related to these parameters.
2022-04-12 17:14:39 +02:00
Marko Mäkelä
5d8dcfd86c MDEV-25975: Merge 10.4 into 10.5 2022-04-06 10:30:49 +03:00
Marko Mäkelä
d172df9913 MDEV-25975: Merge 10.3 into 10.4 2022-04-06 09:18:38 +03:00
Marko Mäkelä
e9735a8185 MDEV-25975 innodb_disallow_writes causes shutdown to hang
We will remove the parameter innodb_disallow_writes because it is badly
designed and implemented. The parameter was never allowed at startup.
It was only internally used by Galera snapshot transfer.
If a user executed
SET GLOBAL innodb_disallow_writes=ON;
the server could hang even on subsequent read operations.

During Galera snapshot transfer, we will block writes
to implement an rsync friendly snapshot, as follows:

sst_flush_tables() will acquire a global lock by executing
FLUSH TABLES WITH READ LOCK, which will block any writes
at the high level.

sst_disable_innodb_writes(), invoked via ha_disable_internal_writes(true),
will suspend or disable InnoDB background tasks or threads that could
initiate writes. As part of this, log_make_checkpoint() will be invoked
to ensure that anything in the InnoDB buf_pool.flush_list will be written
to the data files. This has the nice side effect that the Galera joiner
will avoid crash recovery.

The changes to sql/wsrep.cc and to the tests are based on a prototype
that was developed by Jan Lindström.

Reviewed by: Jan Lindström
2022-04-06 08:06:49 +03:00
Marko Mäkelä
d62b0368ca Merge 10.4 into 10.5 2022-03-29 12:59:18 +03:00
Marko Mäkelä
ae6e214fd8 Merge 10.3 into 10.4 2022-03-29 11:13:18 +03:00
sjaakola
9b2fa2ae8e MDEV-24845 Oddities around innodb_fatal_semaphore_wait_threshold and global.innodb_disallow_writes
This commit adds a mtr test for reproducing a test scenario where despite of
innodb_disallow_writes blocking, writes to file system can still happen.

The test launches a garbd node, which triggers one of the cluster node to switch to
SST donor state. In this state, all disk activity should be halted, and e.g.
innodb_disallow_writes has been set. The test records md5sum aggregate over mariadb
data directory when the node enters the donor state, and records another md5sum
when the node leaves the donor state. If there is no IO activity in data directory, these
hashes should be equal.

For this test, the Donor state processing, has beeen instrumented so that, SST donor thread can be
stopped when entering the donor state. The test uses this new dbug sync point,
to control when to record the md5sums.

New SST script was added: wsrep_sst_backup, and garbd uses backup method to lauch the donor
node to call this script, and to enter in donor state.

The backup script could be later extended as general purpose backup method for the cluster.

This commit fixes also one race condition happening in wsrep_sst_rsync, like this:
* wsrep_rsync_sst script requests for flush tables,
  and then waits in a loop until mariadbd has created file tables_flushed,
  as confirmation that FLUSH TABLES has completed
* mariadbd's SST donor thread, wakes for the flush table request and then performs FTWRL,
  and after this it creates the tables_flushed file
* note that SST script will now continue to startup rsync sending
* mariadbd's SST donor thread now calls for sst_disallow_writes(),
  so that innodb would setup disk IO blockage, however rsyncing may already be ongoing at this point

This race condition is fixed in this commit, by performing all disk IO blocking before
creating the tables_flushed file.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-25 10:04:15 +02:00
Julius Goryavsky
55bb933a88 Merge branch 10.4 into 10.5 2021-12-26 12:51:04 +01:00
Alexey Yurchenko
5c8e628dda wsrep-lib update: bugfixes, cleanups, event API, state transition cleanups
Don't switch state to DONOR in `wsrep_sst_donate()` - wsrep-lib does it now
2021-12-08 13:16:23 +02:00
Marko Mäkelä
15dcb8bd3e Merge 10.4 into 10.5 2021-07-02 13:02:26 +03:00
Jan Lindström
1c03e7a667 MDEV-25978 : rsync SST does not work with custom binlog name
wsrep_sst_common did not correctly set name for binlog index
file if custom binlog name was used and this name was
not added to script command line.

Added test case for both log_basename and log_binlog.
2021-06-25 21:23:04 +02:00
Jan Lindström
05a4996c5c MDEV-25978 : rsync SST does not work with custom binlog name
wsrep_sst_common did not correctly set name for binlog index
file if custom binlog name was used and this name was
not added to script command line.

Added test case for both log_basename and log_binlog.
2021-06-25 07:15:00 +02:00
Julius Goryavsky
740917620a MDEV-25693: SST failed due to incorrect connection address
Fixed bugs caused by inaccuracies in automatic merging
from other branches:

1) Authentication information is not removed from the connection
   address, which causes some tests to fail;
2) wsrep_debug=on should be replaced with wsrep_debug=1;
3) Added missing "connection" lines to test result file;
4) Some tests have been corrected for Galera 4.x (10.4+).
2021-05-17 20:33:55 +02:00
Julius Goryavsky
527675d53a MDEV-25669: SST scripts should check all server groups in config files
1) This commit implements reading all sections from configuration
files while looking for the current value of any server variable,
which were previously only read from the [mysqld.suffix] group and
from [mysqld], but not from other groups such as [mariadb.suffix],
[mariadb] or, for example, [server].

2) This commit also fixes misrecognition of some parameters when
parsing a command line containing a special marker for the end
of the list of options ("--") or when short option names (such
as "-s", "-a" and "-h arg") chained together (like a "-sah arg").
Such parameters can be passed to the SST script in the list of
arguments after "--mysqld-args" if the server is started with a
complex set of options - this was revealed during manual testing
of changes to read configuration files.

3) The server-side preparation code for the "--mysqld-args"
option list has also been simplified to make it easier to change
in the future (if needed), and has been improved to properly
handle the special backquote ("`") character in the argument
values.
2021-05-17 20:33:06 +02:00
Julius Goryavsky
e861e057ad MDEV-25693: SST failed due to incorrect connection address
Fixed bugs caused by inaccuracies in automatic merging
from other branches:

1) Authentication information is not removed from the connection
   address, which causes some tests to fail;
2) wsrep_debug=on should be replaced with wsrep_debug=1;
3) Added missing "connection" lines to test result file;
4) Some tests have been corrected for Galera 4.x (10.4+).
2021-05-17 19:51:49 +02:00
Julius Goryavsky
f92cd0c56b MDEV-25669: SST scripts should check all server groups in config files
1) This commit implements reading all sections from configuration
files while looking for the current value of any server variable,
which were previously only read from the [mysqld.suffix] group and
from [mysqld], but not from other groups such as [mariadb.suffix],
[mariadb] or, for example, [server].

2) This commit also fixes misrecognition of some parameters when
parsing a command line containing a special marker for the end
of the list of options ("--") or when short option names (such
as "-s", "-a" and "-h arg") chained together (like a "-sah arg").
Such parameters can be passed to the SST script in the list of
arguments after "--mysqld-args" if the server is started with a
complex set of options - this was revealed during manual testing
of changes to read configuration files.

3) The server-side preparation code for the "--mysqld-args"
option list has also been simplified to make it easier to change
in the future (if needed), and has been improved to properly
handle the special backquote ("`") character in the argument
values.
2021-05-17 15:08:40 +02:00
Julius Goryavsky
f9f8e33f29 MDEV-25669: SST scripts should check all server groups in config files
1) This commit implements reading all sections from configuration
files while looking for the current value of any server variable,
which were previously only read from the [mysqld.suffix] group and
from [mysqld], but not from other groups such as [mariadb.suffix],
[mariadb] or, for example, [server].

2) This commit also fixes misrecognition of some parameters when
parsing a command line containing a special marker for the end
of the list of options ("--") or when short option names (such
as "-s", "-a" and "-h arg") chained together (like a "-sah arg").
Such parameters can be passed to the SST script in the list of
arguments after "--mysqld-args" if the server is started with a
complex set of options - this was revealed during manual testing
of changes to read configuration files.

3) The server-side preparation code for the "--mysqld-args"
option list has also been simplified to make it easier to change
in the future (if needed), and has been improved to properly
handle the special backquote ("`") character in the argument
values.
2021-05-17 14:58:49 +02:00
Marko Mäkelä
0e1437e147 Merge 10.4 into 10.5 2021-05-10 10:01:15 +03:00
Marko Mäkelä
8c73fab7f7 Merge 10.3 into 10.4 2021-05-10 09:52:01 +03:00
Marko Mäkelä
98e6159892 Merge 10.2 into 10.3 2021-05-10 09:09:50 +03:00
Alexey Yurchenko
54d7ba9609 MDEV-25418: Improve mariabackup SST script compliance with native MariaDB SSL practices
and configuration.

1. Pass joiner's authentication information to donor together with address
   in State Transfer Request. This allows joiner to authenticate donor on
   connection. Previously joiner would accept data from anywhere.

2. Deprecate custom SSL configuration variables tca, tcert and tkey in favor
   of more familiar ssl-ca, ssl-cert and ssl-key. For backward compatibility
   tca, tcert and tkey are still supported.

3. Allow falling back to server-wide SSL configuration in [mysqld] if no SSL
   configuration is found in [sst] section of the config file.

4. Introduce ssl-mode variable in [sst] section that takes standard values
   and has following effects:
    - old-style SSL configuration present in [sst]: no effect
      otherwise:
    - ssl-mode=DISABLED or absent: retains old, backward compatible behavior
      and ignores any other SSL configuration
    - ssl-mode=VERIFY*: verify joiner's certificate and CN on donor,
                        verify donor's secret on joiner
                        (passed to donor via State Transfer Request)
                        BACKWARD INCOMPATIBLE BEHAVIOR
    - anything else enables new SSL configuration convetions but does not
      require verification

    ssl-mode should be set to VERIFY only in a fully upgraded cluster.

    Examples:

    [mysqld]
    ssl-cert=/path/to/cert
    ssl-key=/path/to/key
    ssl-ca=/path/to/ca

    [sst]

     -- server-wide SSL configuration is ignored, SST does not use SSL

    [mysqld]
    ssl-cert=/path/to/cert
    ssl-key=/path/to/key
    ssl-ca=/path/to/ca

    [sst]
    ssl-mode=REQUIRED

     -- use server-wide SSL configuration for SST but don't attempt to
        verify the peer identity

    [sst]
    ssl-cert=/path/to/cert
    ssl-key=/path/to/key
    ssl-ca=/path/to/ca
    ssl-mode=VERIFY_CA

     -- use SST-specific SSL configuration for SST and require verification
        on both sides

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2021-05-06 04:03:07 +02:00
Marko Mäkelä
80ed136e6d Merge 10.4 into 10.5 2021-04-21 09:01:01 +03:00
Jan Lindström
57caff245c MDEV-25423 : Donor node fails to shutdown after mysqldump SST
* Table should have primary key
* Enable wsrep_sync_wait before final selects
* Enable autocommit before final selects.
* Fix joiner monitoring in case of mysqldump.
* Add wait_conditions to stabilize
2021-04-19 12:12:18 +03:00
Sergei Golubchik
f33e57a9e6 Merge branch '10.4' into 10.5 2021-02-23 13:06:22 +01:00
Sergei Golubchik
e841957416 Merge branch '10.3' into 10.4 2021-02-23 09:25:57 +01:00
Sergei Golubchik
0ab1e3914c Merge branch '10.2' into 10.3 2021-02-22 22:42:27 +01:00
Sergei Golubchik
25d9d2e37f Merge branch 'bb-10.4-release' into bb-10.5-release 2021-02-15 16:43:15 +01:00
Sergei Golubchik
00a313ecf3 Merge branch 'bb-10.3-release' into bb-10.4-release
Note, the fix for "MDEV-23328 Server hang due to Galera lock conflict resolution"
was null-merged. 10.4 version of the fix is coming up separately
2021-02-12 17:44:22 +01:00
Julius Goryavsky
95003eab45 MDEV-19950: Galera test failure on galera_ssl_upgrade
The test requires adaptation for MariaDB, which is done
in this patch. In addition, this patch includes a fix for
the SST script startup code that adds escaping for special
characters on the command line (in case they are contained
in the arguments to mysqld). The fix does not require
separate tests, as the required tests are already part
of the mtr suite for Galera.
2021-02-11 13:44:01 +01:00
Sergei Golubchik
60ea09eae6 Merge branch '10.2' into 10.3 2021-02-01 13:49:33 +01:00