The reason for the crash was that the code assumed
SELECT_LEX.ref_pointer_array would be initialized with zeros, which was
not the case. This made the test
if (!select->ref_pointer_array[counter]) in item.cc unpredictable
and caused crashes.
Fixed by zero-filling ref_pointer_array on allocation.
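A minimal sketch of the idea, with an illustrative helper name (the real
change is in the server's SELECT_LEX setup code):

  #include <cstdlib>

  struct Item;                                 // stand-in for sql/item.h's Item

  // Zero-filled allocation: ref_pointer_array[i] == nullptr now reliably
  // means "slot not yet resolved", so the check in item.cc is well defined.
  static Item **alloc_ref_pointer_array(size_t slots)
  {
    return static_cast<Item **>(calloc(slots, sizeof(Item *)));
  }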
The problem was that when using clang + ASAN we do not get a correct value
for the thread stack, as some local variables are not allocated on the
normal stack.
It looks like clang 18.1.3, for example, when compiling with
-O2 -fsanitize=address, puts local variables and things allocated by
alloca() in areas other than the stack.
The following gdb session shows the issue:
Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection
(connect=0x5080000027b8,
put_in_cache=<optimized out>) at sql/sql_connect.cc:1399
THD *thd;
1399 thd->thread_stack= (char*) &thd;
(gdb) p &thd
(THD **) 0x7fffedee7060
(gdb) p $sp
(void *) 0x7fffef4e7bc0
The address of thd is 24M away from the stack pointer
(gdb) info reg
...
rsp 0x7fffef4e7bc0 0x7fffef4e7bc0
...
r13 0x7fffedee7060 140737185214560
r13 is pointing to the address of thd, probably some kind of
"local stack" used by the sanitizer.
I have verified this with gdb on a recursive call that calls alloca()
in a loop. In this case all objects were stored in a local heap,
not on the stack.
To solve this issue in a portable way, I have added two functions:
my_get_stack_pointer() returns the address of the current stack pointer.
The code uses asm instructions for Intel 32/64 bit, PowerPC,
ARM 32/64 bit and SPARC 32/64 bit.
Supported compilers are gcc, clang and MSVC.
For MSVC 64 bit we are using _AddressOfReturnAddress().
As a fallback for other compilers/architectures we use the address of a
local variable.
my_get_stack_bounds() returns the address of the stack base and the
stack size, using pthread_attr_getstack() or NtCurrentTeb(), with a
fallback to using the address of a local variable and a user-provided
stack size.
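Roughly, the two helpers have this shape (a simplified sketch covering
only the gcc/clang x86-64 and Linux/glibc cases; the real code handles
the other architectures and MSVC listed above):

  #include <pthread.h>   // pthread_getattr_np() is a GNU extension (glibc)
  #include <cstddef>

  // Sketch of my_get_stack_pointer(): inline asm on x86-64, frame address
  // as a generic gcc/clang approximation.
  static inline void *get_stack_pointer_sketch()
  {
  #if defined(__GNUC__) && defined(__x86_64__)
    void *sp;
    __asm__ volatile ("movq %%rsp, %0" : "=r" (sp));
    return sp;
  #elif defined(__GNUC__)
    return __builtin_frame_address(0);   // close enough to the stack pointer
  #else
    return nullptr;  // real code: caller falls back to the address of a local
  #endif
  }

  // Sketch of my_get_stack_bounds(): stack base and size from the pthread
  // attributes, with a caller-provided estimate as the fallback.
  static void get_stack_bounds_sketch(void **base, size_t *size,
                                      void *local_addr, size_t guessed_size)
  {
    pthread_attr_t attr;
    if (pthread_getattr_np(pthread_self(), &attr) == 0)
    {
      pthread_attr_getstack(&attr, base, size);
      pthread_attr_destroy(&attr);
      return;
    }
    *base= local_addr;       // estimate: address of a caller-local variable
    *size= guessed_size;     // estimate: user-provided stack size
  }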
Server changes are:
- Moving the setting of thread_stack to THD::store_globals(), using
my_get_stack_bounds().
- Removing the setting of thd->thread_stack, except in functions that
allocate a lot on the stack before calling store_globals(). When
using an estimate for the stack start, we reduce stack_size by
MY_STACK_SAFE_MARGIN (8192) to account for the stack used before
store_globals() is called; see the sketch below.
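The margin logic amounts to something like this (illustrative only;
MY_STACK_SAFE_MARGIN is the constant named above):

  #include <cstddef>

  static const size_t MY_STACK_SAFE_MARGIN= 8192;

  // Sketch: when the stack start is only an estimate (the address of a
  // local variable), shrink the usable size to account for the frames
  // already consumed before store_globals() was reached.
  static size_t usable_stack_size(size_t configured_size,
                                  bool start_is_estimated)
  {
    if (start_is_estimated && configured_size > MY_STACK_SAFE_MARGIN)
      return configured_size - MY_STACK_SAFE_MARGIN;
    return configured_size;
  }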
I also added a unittest, stack_allocation-t, to verify the new code.
Reviewed-by: Sergei Golubchik <serg@mariadb.org>
Remove the workaround for MDEV-13941; it served for 5 years, and all
affected pre-release 10.2 installations should have been fixed in the
meantime.
Apparently InnoDB uses the is_sparse parameter of os_file_set_size()
inconsistently, and it now passes is_sparse=false during the first file
extension. With the MDEV-13941 workaround in place, it would unsparse
the file, which makes compression not work at all anymore.
In commit b7b9f3ce82 (MDEV-34515) we
accidentally made the InnoDB MVCC code acquire a shared
purge_sys.latch twice. Recursive shared latch acquisition may cause a
deadlock of InnoDB threads if another thread starts waiting for an
exclusive latch in between.
purge_sys_t::latch: In debug builds, use srw_lock_debug instead of
srw_spin_lock, so that bugs like this will result in debug assertion
failures.
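A stripped-down illustration of what such a debug latch catches (this is
not InnoDB's srw_lock_debug, just the idea): track per-thread shared
ownership and assert if the same thread takes the shared latch twice.

  #include <shared_mutex>
  #include <cassert>

  class debug_shared_latch
  {
    std::shared_mutex latch;
    static thread_local bool holds_shared;   // per-thread ownership flag
  public:
    void s_lock()
    {
      // Recursive shared acquisition can deadlock if another thread queues
      // an exclusive request in between; fail loudly in debug builds.
      assert(!holds_shared);
      latch.lock_shared();
      holds_shared= true;
    }
    void s_unlock()
    {
      assert(holds_shared);
      holds_shared= false;
      latch.unlock_shared();
    }
  };

  thread_local bool debug_shared_latch::holds_shared= false;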
trx_undo_report_row_operation(): Pass the view_guard to
trx_undo_prev_version() and the rest of the arguments in the same
order, so that the work to permute argument registers is minimized.
Implement the variable legacy_xa_rollback_at_disconnect to support
backwards compatibility for applications that rely on the pre-10.5
behavior for connection disconnect, which is to roll back the
transaction (in violation of the XA specification).
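Roughly the intended disconnect-time behaviour (a sketch with
illustrative names, not the server's actual disconnect path):

  // Sketch: with the compatibility variable set, restore the pre-10.5
  // behaviour and roll the XA transaction back at disconnect; otherwise
  // keep it, as the XA specification requires.
  struct connection_sketch
  {
    bool in_xa_transaction;
    bool legacy_xa_rollback_at_disconnect;   // the new variable
  };

  static void on_disconnect_sketch(connection_sketch *conn)
  {
    if (conn->in_xa_transaction && conn->legacy_xa_rollback_at_disconnect)
    {
      /* roll back the XA transaction here (pre-10.5 behaviour) */
    }
    /* otherwise leave it for a later XA COMMIT or XA ROLLBACK */
  }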
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
If the argv0 passed to my_get_exe is null, don't
let the fallback in my_realpath handle it.
This was the cause of the original SEGV on NetBSD,
so we don't want this to be the code path for any
unimplemented ports of this function.
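The guard amounts to something like this (illustrative signature; the
real function fills in a path buffer):

  #include <cstddef>

  // Sketch: bail out early on a null argv0 instead of letting
  // my_realpath()'s fallback see it.
  static int my_get_exe_sketch(const char *argv0, char *out, size_t out_len)
  {
    if (argv0 == nullptr)
      return 1;                 // error: nothing to resolve on this platform
    /* resolve argv0 to an absolute executable path into out here */
    (void) out; (void) out_len;
    return 0;
  }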
Set select_thread_in_use only when we're about to enter the polling
loop, not sooner, allowing early process aborts to exit cleanly: the
process won't be waiting for a polling loop that isn't yet polling.
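The ordering is roughly this (an illustrative sketch, not the server's
actual main loop):

  #include <atomic>

  std::atomic<bool> select_thread_in_use{false};
  std::atomic<bool> abort_requested{false};

  static void poll_loop_sketch()
  {
    // Set the flag only immediately before entering the polling loop; an
    // early abort then never waits on a loop that never started polling.
    select_thread_in_use.store(true);
    while (!abort_requested.load())
    {
      /* poll for new connections here */
    }
    select_thread_in_use.store(false);
  }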
Pre-11.0 variant:
1. In recompute_join_cost_with_limit(), add an assertion that
partial_join_cost >= 0.0.
2. best_extension_by_limited_search() subtracts COST_EPS from
join->best_read, but it is not subtracted from
join->positions[0].read_time; add it back.
3. We could get a very small negative partial_join_cost due to rounding
errors. For fraction=1.0, we were essentially computing this (denote it
EXPR-1):
$row_read_cost + $where_cost - ($row_read_cost + $where_cost)
which should compute to 0.
But the computation was done in the following order (left-to-right),
EXPR-2:
($row_read_cost + $where_cost) - $row_read_cost - $where_cost
and this produced a value of -1.1102230246251565e-16 due to a rounding
error. Change the computation to use EXPR-1 instead of EXPR-2.
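The rounding effect is easy to reproduce with plain doubles (the operand
values here are arbitrary; only the evaluation order matters):

  #include <cstdio>

  int main()
  {
    double row_read_cost= 0.7, where_cost= 0.1;
    // EXPR-1: both parenthesized sums round to the same double, so the
    // difference is exactly 0.
    double expr1= row_read_cost + where_cost -
                  (row_read_cost + where_cost);
    // EXPR-2: the left-to-right subtraction re-rounds and can leave a
    // tiny non-zero residue (possibly negative).
    double expr2= (row_read_cost + where_cost) - row_read_cost - where_cost;
    printf("%.17g %.17g\n", expr1, expr2);
    return 0;
  }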
optimize_straight_join() and best_extension_by_limited_search()
use 0.001 to make the choice between plans with identical costs
deterministic. Use COST_EPS instead of 0.001, as is done in newer
versions.
Problem:
=======
- A redundant table fails to insert a row after an instant drop of a
BLOB column. Instant drop column only marks the column as hidden, and a
subsequent insert statement tries to insert a NULL value for the
dropped BLOB column, with the fixed length of the blob type being
returned as 65535. This leads to a row size too large error.
Fix:
====
For a redundant table, if the non-fixed dropped column can be NULL,
then set the length of the field type to 0.
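In effect (an illustrative sketch, not the InnoDB function itself), the
per-column length used in the row size check becomes:

  // Sketch of the rule described above (names are illustrative).
  static unsigned dropped_column_length(bool is_dropped, bool is_nullable,
                                        bool is_fixed_length,
                                        unsigned type_max_length)
  {
    // A dropped, nullable, non-fixed column only ever stores NULL, so it
    // should not contribute its type maximum (65535 for BLOB) to the row
    // size calculation.
    if (is_dropped && !is_fixed_length && is_nullable)
      return 0;
    return type_max_length;
  }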
Stop skipping const items when selecting, but skip them when storing
their results to the spider row, to avoid storing into mismatching
temporary table fields.
Skip auxiliary fields when SELECTing, and accordingly do not store
the (non-existing) results into the corresponding temporary table
fields.
When there are BOTH auxiliary fields AND const items among the
auxiliary field items, do not use the spider GBH. This is a rare
occasion, if it happens at all, and not worth the added complexity to
cover.
Use the original item (item_ptr) in constructing GROUP BY and ORDER
BY, which also means using item->name instead of field->field_name as
aliases in constructing SELECT items. This fixes spurious regressions
caused by the above changes in some tests using ORDER BY, such as
mdev_24517.test. As a by-product, this also fixes MDEV-29546.
Therefore we update mdev_29008.test to include the MDEV-29546 case.
Remove the dead code in Spider related to Spider's Oracle OCI
support. The code has been disabled for a long time and it is unlikely
that it will ever be enabled.
During spider query construction of certain cast functions, spider
locates the last occurrence of a keyword in the output of
Item::print() and appends from there to the query constructed so
far. For example, consider the following query:
SELECT * FROM t2 ORDER BY CAST(c AS INET6);
It constructs the following query and executes it at the data
node (assuming the data node table is called t0).
select cast(t0.`c` as inet6) ``,t0.`c` `c` from `test`.`t1` t0 order by ``
When the construction has completed the initial part
select cast(t0.`c`
it then attempts to construct the " as inet6" part. To that end, it
calls print() on the Item_typecast_fbt corresponding to the cast item,
and obtains
cast(`test`.`t2`.`c` as inet6)
It then looks for " as " and places the cursor there for appending:
cast(`test`.`t2`.`c` as inet6)
                    ^
In this patch, if the search fails, i.e. there is no " as ...", we
make sure that the cursor is not placed before the beginning of the
string (out of bounds).
We also relax the search from " as char" to " as " in the case of
CHAR_TYPECAST_FUNC, since there is more than one Item type with this
func type. For example, "AS INET6" is an Item_typecast_fbt which has
this func type.
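A sketch of the clamping on top of std::string (the real code works on
spider's own string buffers and cursor arithmetic):

  #include <string>

  // Returns the position from which to append the " as <type>" tail,
  // clamped so a failed search cannot move the cursor out of bounds.
  static size_t cast_tail_position(const std::string &printed_item)
  {
    size_t pos= printed_item.rfind(" as ");   // relaxed from " as char"
    if (pos == std::string::npos)
      return printed_item.size();             // nothing found: stay in bounds
    return pos;
  }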
When a DDL statement results in a local partitioned table with
partitions not covering all values in the table, a failure is emitted.
However, when the table in question is a spider table, the issue does
not surface until some later statements (DELETE in the test examples
in this commit) are executed. This is consistent with the design of
spider, which aims to minimise connections to the data node. The
resulting error is legitimate and should not result in an assertion
failure. Similarly, a partitioned spider table could have misplaced
rows, so we remove the other assertion as well.
- document tmp_share, the temporary spider shares with only one
link (no ha)
- simplify spider_get_sys_tables_connect_info() where link_idx is
always 0
- InnoDB fails to set the index information or index number
for the spatial index error HA_ERR_NULL_IN_SPATIAL.
row_build_spatial_index_key(): Initialize the tmp_mbr array completely.
check_if_supported_inplace_alter(): Fix the spelling mistake of alter
Sometimes, in MariaDB Server 10.5 but apparently not in later branches,
the test would hang because con1 and con2 would be blocked in
debug_sync (for example, lock_wait_suspend_thread_enter and
row_ins_sec_index_entry_dup_locks_created), thereby preventing
the purge of transactions from completing.
To prevent an occasional DEBUG_SYNC induced hang in the test, we will
wait for everything to be purged, except the last 2 transactions.
This change should be null-merged to 10.6, because the test is not
failing in 10.6 or later major versions.
Field_blob::store() has special code for the GROUP_CONCAT temporary
table (to store blob values in Blob_mem_storage - this prevents them
from being freed/overwritten when the next row is read).
Field_geom and Field_blob_compressed inherit from Field_blob but they
have their own ::store() methods without this special Blob_mem_storage
support.
Considering that a non-grouping CONCAT() of such fields converts
them to plain BLOB, let's do the same for GROUP_CONCAT. To do it,
Item_func_group_concat::setup() will signal that it is creating
a temporary table for GROUP_CONCAT, and the Field_blob::make_new_field()
override will create a base Field_blob when under GROUP_CONCAT.
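The shape of the change, roughly (a self-contained sketch with
illustrative class names, not the server's Field hierarchy):

  // Sketch: for a GROUP_CONCAT temporary table, make_new_field() degrades
  // a derived blob field (geometry, compressed) to a plain blob, so values
  // go through Field_blob::store() and its Blob_mem_storage handling.
  struct Field_sketch
  {
    virtual ~Field_sketch() {}
    virtual Field_sketch *clone() const = 0;     // normal tmp-table copy
  };

  struct Field_blob_sketch : Field_sketch
  {
    Field_sketch *clone() const override { return new Field_blob_sketch(); }

    Field_sketch *make_new_field(bool for_group_concat) const
    {
      if (for_group_concat)
        return new Field_blob_sketch();          // plain blob under GROUP_CONCAT
      return clone();                            // otherwise keep the real type
    }
  };

  struct Field_geom_sketch : Field_blob_sketch
  {
    Field_sketch *clone() const override { return new Field_geom_sketch(); }
  };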
The hash index is a vcol-based wrapper (MDEV-371). row_end is added to
the unique index, so when row_end is updated the unique hash index must
be recalculated via vcol_update_fields(). DELETE did not update virtual
fields, so DELETE HISTORY was getting a wrong hash value.
The fix does update_virtual_fields() in vers_update_end(), so in every
case where row_end is updated the virtual fields are updated as well.
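Conceptually the fix amounts to this (illustrative names and a stand-in
hash; the real code updates the table's virtual columns):

  // Sketch: whenever row_end is set for system versioning, recompute the
  // virtual columns too, so the vcol-based unique hash index sees the new
  // row_end value (DELETE HISTORY previously skipped this step).
  struct row_sketch
  {
    unsigned long long row_end;
    unsigned long long hash_vcol;   // virtual column derived from row_end
  };

  static void update_virtual_fields_sketch(row_sketch *row)
  {
    row->hash_vcol= row->row_end * 2654435761ULL;   // stand-in hash
  }

  static void vers_update_end_sketch(row_sketch *row,
                                     unsigned long long now)
  {
    row->row_end= now;
    update_virtual_fields_sketch(row);   // keep the hash index consistent
  }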