MDEV-34533 asan error about stack overflow when writing record in Aria
The problem was that when using clang + asan, we do not get a correct value
for the thread stack as some local variables are not allocated at the
normal stack.
It looks like that for example clang 18.1.3, when compiling with
-O2 -fsanitize=addressan it puts local variables and things allocated by
alloca() in other areas than on the stack.
The following code shows the issue
Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection
(connect=0x5080000027b8,
put_in_cache=<optimized out>) at sql/sql_connect.cc:1399
THD *thd;
1399 thd->thread_stack= (char*) &thd;
(gdb) p &thd
(THD **) 0x7fffedee7060
(gdb) p $sp
(void *) 0x7fffef4e7bc0
The address of thd is 24M away from the stack pointer
(gdb) info reg
...
rsp 0x7fffef4e7bc0 0x7fffef4e7bc0
...
r13 0x7fffedee7060 140737185214560
r13 is pointing to the address of the thd. Probably some kind of
"local stack" used by the sanitizer
I have verified this with gdb on a recursive call that calls alloca()
in a loop. In this case all objects was stored in a local heap,
not on the stack.
To solve this issue in a portable way, I have added two functions:
my_get_stack_pointer() returns the address of the current stack pointer.
The code is using asm instructions for intel 32/64 bit, powerpc,
arm 32/64 bit and sparc 32/64 bit.
Supported compilers are gcc, clang and MSVC.
For MSVC 64 bit we are using _AddressOfReturnAddress()
As a fallback for other compilers/arch we use the address of a local
variable.
my_get_stack_bounds() that will return the address of the base stack
and stack size using pthread_attr_getstack() or NtCurrentTed() with
fallback to using the address of a local variable and user provided
stack size.
Server changes are:
- Moving setting of thread_stack to THD::store_globals() using
my_get_stack_bounds().
- Removing setting of thd->thread_stack, except in functions that
allocates a lot on the stack before calling store_globals(). When
using estimates for stack start, we reduce stack_size with
MY_STACK_SAFE_MARGIN (8192) to take into account the stack used
before calling store_globals().
I also added a unittest, stack_allocation-t, to verify the new code.
Reviewed-by: Sergei Golubchik <serg@mariadb.org>
2024-10-01 16:07:48 +02:00
|
|
|
/* Copyright 2024 MariaDB corporation AB
|
|
|
|
|
|
|
|
This program is free software; you can redistribute it and/or modify
|
|
|
|
it under the terms of the GNU General Public License as published by
|
|
|
|
the Free Software Foundation; version 2 of the License.
|
|
|
|
|
|
|
|
This program is distributed in the hope that it will be useful,
|
|
|
|
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
|
|
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|
|
|
GNU General Public License for more details.
|
|
|
|
|
|
|
|
You should have received a copy of the GNU General Public License
|
|
|
|
along with this program; if not, write to the Free Software
|
|
|
|
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1335 USA
|
|
|
|
*/
|
|
|
|
|
|
|
|
|
|
|
|
/*
|
|
|
|
Get start and end of stack
|
|
|
|
|
|
|
|
Note that the code depends on STACK_DIRECTION that is defined by cmake.
|
|
|
|
In general, STACK_DIRECTION should be 1 except in case of
|
|
|
|
an upward-growing stack that can happen on sparc or hpux platforms.
|
|
|
|
*/
|
|
|
|
|
|
|
|
|
|
|
|
#include "my_global.h"
|
|
|
|
#include "my_sys.h"
|
|
|
|
#include "my_stack_alloc.h"
|
|
|
|
|
|
|
|
#ifdef _WIN_
|
|
|
|
#include <windows.h>
|
|
|
|
#include <processthreadsapi.h>
|
|
|
|
#include <winnt.h>
|
|
|
|
#endif
|
|
|
|
|
|
|
|
/* Get start and end of stack */
|
|
|
|
|
|
|
|
extern void my_get_stack_bounds(void **stack_start, void **stack_end,
|
|
|
|
void *fallback_stack_start,
|
|
|
|
size_t fallback_stack_size)
|
|
|
|
{
|
|
|
|
size_t stack_size;
|
Fixup bddbef357334
For Windows, the method of finding stack limit is reportedly flaky,
and might not work as desired, as documented in
https://joeduffyblog.com/2006/07/15/checking-for-sufficient-stack-space
"Unfortunately, the StackLimit is only updated as you actually touch pages on
the stack, and thus it’s not a reliable way to find out how much
uncommitted stack is left."
Thus, Windows specific code was removed. It might be added, if we find out
that we need it, so far there was no need.
Also AIX, the code based on HAVE_PTHREAD_GETATTR_NP was found not to work,
(produce false positives of stack overrun), thus the traditional
fallback code is used.
Also
- removed repetitive fallback code
- fixed non-portable void pointer arithmethics (GCC-ism)
- took into account that pthread_attr_getstack() can fail,
- fixed the code for (less common) STACK_DIRECTION > 0.
- removed confusing/wrong comments about what "stack base address" means
Single Unix Spec, AIX documentation make it clear what that is.
2024-10-30 21:29:21 +01:00
|
|
|
#if defined(HAVE_PTHREAD_GETATTR_NP) && !defined(_AIX)
|
MDEV-34533 asan error about stack overflow when writing record in Aria
The problem was that when using clang + asan, we do not get a correct value
for the thread stack as some local variables are not allocated at the
normal stack.
It looks like that for example clang 18.1.3, when compiling with
-O2 -fsanitize=addressan it puts local variables and things allocated by
alloca() in other areas than on the stack.
The following code shows the issue
Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection
(connect=0x5080000027b8,
put_in_cache=<optimized out>) at sql/sql_connect.cc:1399
THD *thd;
1399 thd->thread_stack= (char*) &thd;
(gdb) p &thd
(THD **) 0x7fffedee7060
(gdb) p $sp
(void *) 0x7fffef4e7bc0
The address of thd is 24M away from the stack pointer
(gdb) info reg
...
rsp 0x7fffef4e7bc0 0x7fffef4e7bc0
...
r13 0x7fffedee7060 140737185214560
r13 is pointing to the address of the thd. Probably some kind of
"local stack" used by the sanitizer
I have verified this with gdb on a recursive call that calls alloca()
in a loop. In this case all objects was stored in a local heap,
not on the stack.
To solve this issue in a portable way, I have added two functions:
my_get_stack_pointer() returns the address of the current stack pointer.
The code is using asm instructions for intel 32/64 bit, powerpc,
arm 32/64 bit and sparc 32/64 bit.
Supported compilers are gcc, clang and MSVC.
For MSVC 64 bit we are using _AddressOfReturnAddress()
As a fallback for other compilers/arch we use the address of a local
variable.
my_get_stack_bounds() that will return the address of the base stack
and stack size using pthread_attr_getstack() or NtCurrentTed() with
fallback to using the address of a local variable and user provided
stack size.
Server changes are:
- Moving setting of thread_stack to THD::store_globals() using
my_get_stack_bounds().
- Removing setting of thd->thread_stack, except in functions that
allocates a lot on the stack before calling store_globals(). When
using estimates for stack start, we reduce stack_size with
MY_STACK_SAFE_MARGIN (8192) to take into account the stack used
before calling store_globals().
I also added a unittest, stack_allocation-t, to verify the new code.
Reviewed-by: Sergei Golubchik <serg@mariadb.org>
2024-10-01 16:07:48 +02:00
|
|
|
/* POSIX-compliant system (Linux, macOS, etc.) */
|
|
|
|
pthread_attr_t attr;
|
|
|
|
pthread_t thread= pthread_self();
|
|
|
|
|
|
|
|
/* Get the thread attributes */
|
|
|
|
if (pthread_getattr_np(thread, &attr) == 0)
|
|
|
|
{
|
|
|
|
/* Get stack base and size */
|
Fixup bddbef357334
For Windows, the method of finding stack limit is reportedly flaky,
and might not work as desired, as documented in
https://joeduffyblog.com/2006/07/15/checking-for-sufficient-stack-space
"Unfortunately, the StackLimit is only updated as you actually touch pages on
the stack, and thus it’s not a reliable way to find out how much
uncommitted stack is left."
Thus, Windows specific code was removed. It might be added, if we find out
that we need it, so far there was no need.
Also AIX, the code based on HAVE_PTHREAD_GETATTR_NP was found not to work,
(produce false positives of stack overrun), thus the traditional
fallback code is used.
Also
- removed repetitive fallback code
- fixed non-portable void pointer arithmethics (GCC-ism)
- took into account that pthread_attr_getstack() can fail,
- fixed the code for (less common) STACK_DIRECTION > 0.
- removed confusing/wrong comments about what "stack base address" means
Single Unix Spec, AIX documentation make it clear what that is.
2024-10-30 21:29:21 +01:00
|
|
|
void *low_addr, *high_addr= NULL;
|
|
|
|
if (pthread_attr_getstack(&attr, &low_addr, &stack_size) == 0)
|
|
|
|
{
|
|
|
|
high_addr= (char *) low_addr + stack_size;
|
|
|
|
#if STACK_DIRECTION < 0
|
|
|
|
*stack_start= high_addr;
|
|
|
|
*stack_end= low_addr;
|
|
|
|
#else
|
|
|
|
*stack_start= low_addr;
|
|
|
|
*stack_end= high_addr;
|
|
|
|
#endif
|
|
|
|
}
|
MDEV-34533 asan error about stack overflow when writing record in Aria
The problem was that when using clang + asan, we do not get a correct value
for the thread stack as some local variables are not allocated at the
normal stack.
It looks like that for example clang 18.1.3, when compiling with
-O2 -fsanitize=addressan it puts local variables and things allocated by
alloca() in other areas than on the stack.
The following code shows the issue
Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection
(connect=0x5080000027b8,
put_in_cache=<optimized out>) at sql/sql_connect.cc:1399
THD *thd;
1399 thd->thread_stack= (char*) &thd;
(gdb) p &thd
(THD **) 0x7fffedee7060
(gdb) p $sp
(void *) 0x7fffef4e7bc0
The address of thd is 24M away from the stack pointer
(gdb) info reg
...
rsp 0x7fffef4e7bc0 0x7fffef4e7bc0
...
r13 0x7fffedee7060 140737185214560
r13 is pointing to the address of the thd. Probably some kind of
"local stack" used by the sanitizer
I have verified this with gdb on a recursive call that calls alloca()
in a loop. In this case all objects was stored in a local heap,
not on the stack.
To solve this issue in a portable way, I have added two functions:
my_get_stack_pointer() returns the address of the current stack pointer.
The code is using asm instructions for intel 32/64 bit, powerpc,
arm 32/64 bit and sparc 32/64 bit.
Supported compilers are gcc, clang and MSVC.
For MSVC 64 bit we are using _AddressOfReturnAddress()
As a fallback for other compilers/arch we use the address of a local
variable.
my_get_stack_bounds() that will return the address of the base stack
and stack size using pthread_attr_getstack() or NtCurrentTed() with
fallback to using the address of a local variable and user provided
stack size.
Server changes are:
- Moving setting of thread_stack to THD::store_globals() using
my_get_stack_bounds().
- Removing setting of thd->thread_stack, except in functions that
allocates a lot on the stack before calling store_globals(). When
using estimates for stack start, we reduce stack_size with
MY_STACK_SAFE_MARGIN (8192) to take into account the stack used
before calling store_globals().
I also added a unittest, stack_allocation-t, to verify the new code.
Reviewed-by: Sergei Golubchik <serg@mariadb.org>
2024-10-01 16:07:48 +02:00
|
|
|
pthread_attr_destroy(&attr); /* Clean up */
|
Fixup bddbef357334
For Windows, the method of finding stack limit is reportedly flaky,
and might not work as desired, as documented in
https://joeduffyblog.com/2006/07/15/checking-for-sufficient-stack-space
"Unfortunately, the StackLimit is only updated as you actually touch pages on
the stack, and thus it’s not a reliable way to find out how much
uncommitted stack is left."
Thus, Windows specific code was removed. It might be added, if we find out
that we need it, so far there was no need.
Also AIX, the code based on HAVE_PTHREAD_GETATTR_NP was found not to work,
(produce false positives of stack overrun), thus the traditional
fallback code is used.
Also
- removed repetitive fallback code
- fixed non-portable void pointer arithmethics (GCC-ism)
- took into account that pthread_attr_getstack() can fail,
- fixed the code for (less common) STACK_DIRECTION > 0.
- removed confusing/wrong comments about what "stack base address" means
Single Unix Spec, AIX documentation make it clear what that is.
2024-10-30 21:29:21 +01:00
|
|
|
if (high_addr)
|
|
|
|
return;
|
MDEV-34533 asan error about stack overflow when writing record in Aria
The problem was that when using clang + asan, we do not get a correct value
for the thread stack as some local variables are not allocated at the
normal stack.
It looks like that for example clang 18.1.3, when compiling with
-O2 -fsanitize=addressan it puts local variables and things allocated by
alloca() in other areas than on the stack.
The following code shows the issue
Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection
(connect=0x5080000027b8,
put_in_cache=<optimized out>) at sql/sql_connect.cc:1399
THD *thd;
1399 thd->thread_stack= (char*) &thd;
(gdb) p &thd
(THD **) 0x7fffedee7060
(gdb) p $sp
(void *) 0x7fffef4e7bc0
The address of thd is 24M away from the stack pointer
(gdb) info reg
...
rsp 0x7fffef4e7bc0 0x7fffef4e7bc0
...
r13 0x7fffedee7060 140737185214560
r13 is pointing to the address of the thd. Probably some kind of
"local stack" used by the sanitizer
I have verified this with gdb on a recursive call that calls alloca()
in a loop. In this case all objects was stored in a local heap,
not on the stack.
To solve this issue in a portable way, I have added two functions:
my_get_stack_pointer() returns the address of the current stack pointer.
The code is using asm instructions for intel 32/64 bit, powerpc,
arm 32/64 bit and sparc 32/64 bit.
Supported compilers are gcc, clang and MSVC.
For MSVC 64 bit we are using _AddressOfReturnAddress()
As a fallback for other compilers/arch we use the address of a local
variable.
my_get_stack_bounds() that will return the address of the base stack
and stack size using pthread_attr_getstack() or NtCurrentTed() with
fallback to using the address of a local variable and user provided
stack size.
Server changes are:
- Moving setting of thread_stack to THD::store_globals() using
my_get_stack_bounds().
- Removing setting of thd->thread_stack, except in functions that
allocates a lot on the stack before calling store_globals(). When
using estimates for stack start, we reduce stack_size with
MY_STACK_SAFE_MARGIN (8192) to take into account the stack used
before calling store_globals().
I also added a unittest, stack_allocation-t, to verify the new code.
Reviewed-by: Sergei Golubchik <serg@mariadb.org>
2024-10-01 16:07:48 +02:00
|
|
|
}
|
Fixup bddbef357334
For Windows, the method of finding stack limit is reportedly flaky,
and might not work as desired, as documented in
https://joeduffyblog.com/2006/07/15/checking-for-sufficient-stack-space
"Unfortunately, the StackLimit is only updated as you actually touch pages on
the stack, and thus it’s not a reliable way to find out how much
uncommitted stack is left."
Thus, Windows specific code was removed. It might be added, if we find out
that we need it, so far there was no need.
Also AIX, the code based on HAVE_PTHREAD_GETATTR_NP was found not to work,
(produce false positives of stack overrun), thus the traditional
fallback code is used.
Also
- removed repetitive fallback code
- fixed non-portable void pointer arithmethics (GCC-ism)
- took into account that pthread_attr_getstack() can fail,
- fixed the code for (less common) STACK_DIRECTION > 0.
- removed confusing/wrong comments about what "stack base address" means
Single Unix Spec, AIX documentation make it clear what that is.
2024-10-30 21:29:21 +01:00
|
|
|
#endif
|
|
|
|
/* Platform does not have pthread_getattr_np, or fallback */
|
MDEV-34533 asan error about stack overflow when writing record in Aria
The problem was that when using clang + asan, we do not get a correct value
for the thread stack as some local variables are not allocated at the
normal stack.
It looks like that for example clang 18.1.3, when compiling with
-O2 -fsanitize=addressan it puts local variables and things allocated by
alloca() in other areas than on the stack.
The following code shows the issue
Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection
(connect=0x5080000027b8,
put_in_cache=<optimized out>) at sql/sql_connect.cc:1399
THD *thd;
1399 thd->thread_stack= (char*) &thd;
(gdb) p &thd
(THD **) 0x7fffedee7060
(gdb) p $sp
(void *) 0x7fffef4e7bc0
The address of thd is 24M away from the stack pointer
(gdb) info reg
...
rsp 0x7fffef4e7bc0 0x7fffef4e7bc0
...
r13 0x7fffedee7060 140737185214560
r13 is pointing to the address of the thd. Probably some kind of
"local stack" used by the sanitizer
I have verified this with gdb on a recursive call that calls alloca()
in a loop. In this case all objects was stored in a local heap,
not on the stack.
To solve this issue in a portable way, I have added two functions:
my_get_stack_pointer() returns the address of the current stack pointer.
The code is using asm instructions for intel 32/64 bit, powerpc,
arm 32/64 bit and sparc 32/64 bit.
Supported compilers are gcc, clang and MSVC.
For MSVC 64 bit we are using _AddressOfReturnAddress()
As a fallback for other compilers/arch we use the address of a local
variable.
my_get_stack_bounds() that will return the address of the base stack
and stack size using pthread_attr_getstack() or NtCurrentTed() with
fallback to using the address of a local variable and user provided
stack size.
Server changes are:
- Moving setting of thread_stack to THD::store_globals() using
my_get_stack_bounds().
- Removing setting of thd->thread_stack, except in functions that
allocates a lot on the stack before calling store_globals(). When
using estimates for stack start, we reduce stack_size with
MY_STACK_SAFE_MARGIN (8192) to take into account the stack used
before calling store_globals().
I also added a unittest, stack_allocation-t, to verify the new code.
Reviewed-by: Sergei Golubchik <serg@mariadb.org>
2024-10-01 16:07:48 +02:00
|
|
|
*stack_start= my_get_stack_pointer(fallback_stack_start);
|
|
|
|
stack_size= (fallback_stack_size -
|
|
|
|
MY_MIN(fallback_stack_size, MY_STACK_SAFE_MARGIN));
|
Fixup bddbef357334
For Windows, the method of finding stack limit is reportedly flaky,
and might not work as desired, as documented in
https://joeduffyblog.com/2006/07/15/checking-for-sufficient-stack-space
"Unfortunately, the StackLimit is only updated as you actually touch pages on
the stack, and thus it’s not a reliable way to find out how much
uncommitted stack is left."
Thus, Windows specific code was removed. It might be added, if we find out
that we need it, so far there was no need.
Also AIX, the code based on HAVE_PTHREAD_GETATTR_NP was found not to work,
(produce false positives of stack overrun), thus the traditional
fallback code is used.
Also
- removed repetitive fallback code
- fixed non-portable void pointer arithmethics (GCC-ism)
- took into account that pthread_attr_getstack() can fail,
- fixed the code for (less common) STACK_DIRECTION > 0.
- removed confusing/wrong comments about what "stack base address" means
Single Unix Spec, AIX documentation make it clear what that is.
2024-10-30 21:29:21 +01:00
|
|
|
*stack_end= (char *)(*stack_start) + stack_size * STACK_DIRECTION;
|
MDEV-34533 asan error about stack overflow when writing record in Aria
The problem was that when using clang + asan, we do not get a correct value
for the thread stack as some local variables are not allocated at the
normal stack.
It looks like that for example clang 18.1.3, when compiling with
-O2 -fsanitize=addressan it puts local variables and things allocated by
alloca() in other areas than on the stack.
The following code shows the issue
Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection
(connect=0x5080000027b8,
put_in_cache=<optimized out>) at sql/sql_connect.cc:1399
THD *thd;
1399 thd->thread_stack= (char*) &thd;
(gdb) p &thd
(THD **) 0x7fffedee7060
(gdb) p $sp
(void *) 0x7fffef4e7bc0
The address of thd is 24M away from the stack pointer
(gdb) info reg
...
rsp 0x7fffef4e7bc0 0x7fffef4e7bc0
...
r13 0x7fffedee7060 140737185214560
r13 is pointing to the address of the thd. Probably some kind of
"local stack" used by the sanitizer
I have verified this with gdb on a recursive call that calls alloca()
in a loop. In this case all objects was stored in a local heap,
not on the stack.
To solve this issue in a portable way, I have added two functions:
my_get_stack_pointer() returns the address of the current stack pointer.
The code is using asm instructions for intel 32/64 bit, powerpc,
arm 32/64 bit and sparc 32/64 bit.
Supported compilers are gcc, clang and MSVC.
For MSVC 64 bit we are using _AddressOfReturnAddress()
As a fallback for other compilers/arch we use the address of a local
variable.
my_get_stack_bounds() that will return the address of the base stack
and stack size using pthread_attr_getstack() or NtCurrentTed() with
fallback to using the address of a local variable and user provided
stack size.
Server changes are:
- Moving setting of thread_stack to THD::store_globals() using
my_get_stack_bounds().
- Removing setting of thd->thread_stack, except in functions that
allocates a lot on the stack before calling store_globals(). When
using estimates for stack start, we reduce stack_size with
MY_STACK_SAFE_MARGIN (8192) to take into account the stack used
before calling store_globals().
I also added a unittest, stack_allocation-t, to verify the new code.
Reviewed-by: Sergei Golubchik <serg@mariadb.org>
2024-10-01 16:07:48 +02:00
|
|
|
}
|