Commit graph

198354 commits

Author SHA1 Message Date
Marko Mäkelä
a687cf8661 Merge 10.5 into 10.6 2024-06-07 10:03:51 +03:00
Julius Goryavsky
238798d978 MDEV-32158: wsrep_sst_mariabackup use /tmp dir during SST rather than user defined tmpdir
wsrep_sst_mariabackup should use the tmpdir defined by
the user under the '[mysqld]' section of the configuration
file rather than the default '/tmp' directory.
2024-06-06 20:24:13 +02:00
Julius Goryavsky
654f6ecec4 galera: wsrep-lib submodule update 2024-06-06 19:37:31 +02:00
Julius Goryavsky
c2d9762011 mtr: change the default setting for the port group size parameter
Some galera tests start 6 galera nodes. Each galera node requires
three ports: 6*3 = 18. Plus 6 ports are needed for 6 mariadbd servers.
Since the number of ports is rounded up to 10 everywhere in mtr, we
will take 30 as the default value for the port group size parameter.
2024-06-06 19:31:28 +02:00
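The arithmetic in the commit above can be checked with a minimal sketch. The function names here are hypothetical and are not part of mtr; only the figures (6 nodes, 3 ports per node, rounding up to 10) come from the commit message.

```python
def ports_needed(galera_nodes: int, ports_per_node: int = 3) -> int:
    """Ports for the Galera nodes plus one port per mariadbd server."""
    return galera_nodes * ports_per_node + galera_nodes


def round_up_to_multiple(value: int, step: int = 10) -> int:
    """Round value up to the next multiple of step (ceiling division)."""
    return -(-value // step) * step


needed = ports_needed(6)                   # 6*3 + 6
group_size = round_up_to_multiple(needed)  # rounded up to a multiple of 10
print(needed, group_size)
```

With 6 nodes this yields 24 ports, which rounds up to the new default of 30.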
Daniele Sciascia
c1dc03974b MDEV-33523 Spurious deadlock error when wsrep_on=OFF
Avoid starting transactions in wsrep-lib side when wsrep is
disabled. It is unnecessary, and causes spurious deadlock errors on
transaction clean up.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-06-06 19:19:34 +02:00
Jan Lindström
d328705a12 MDEV-34170 : table gtid_slave_pos entries are never deleted with wsrep_gtid_mode = 0
Problem was that updates to mysql.gtid_slave_pos table were
replicated even when they were never used and, because of that,
never deleted. Avoid replication of mysql.gtid_slave_pos
table if wsrep_gtid_mode=OFF.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-06-06 19:19:34 +02:00
Thirunarayanan Balathandayuthapani
a02773f7c0 MDEV-34057 Inconsistent FTS state in concurrent scenarios
Problem:
=======
- This commit is a merge of mysql commit 129ee47ef994652081a11ee9040c0488e5275b14.
InnoDB FTS can be in an inconsistent state when a sync operation
terminates the server before committing the operation. This
could lead to incorrect synced doc id and incorrect query results.

Solution:
========
- During sync commit operation, InnoDB should pass
the sync transaction to update the max doc id
in the config table.

fts_read_synced_doc_id() : This function is used
to read only synced doc id from the config table.
2024-06-06 19:09:13 +05:30
Marko Mäkelä
699d38d951 MDEV-34296 extern thread_local is a CPU waste
In commit 99bd22605938c42d876194f2ec75b32e658f00f5 (MDEV-31558)
we wrongly thought that there would be minimal overhead for accessing
a thread-local variable mariadb_stats.

It turns out that in C++11, each access to an extern thread_local
variable requires conditionally invoking an initialization function.
In fact, the initializer expression of mariadb_stats is dynamic, and
those calls were actually unavoidable.

In C++20, one could declare constinit thread_local variables, but
the address of a thread_local variable (&mariadb_dummy_stats) is not
a compile-time constant. We did not want to declare mariadb_dummy_stats
without thread_local, because then the dummy accesses could lead to
cache line contention between threads.

mariadb_stats: Declare as __thread or __declspec(thread) so that
there will be no dynamic initialization, but zero-initialization.

mariadb_dummy_stats: Remove. It is a lesser evil to let
the environment perform zero-initialization and check if
!mariadb_stats.

Reviewed by: Sergei Petrunia
2024-06-06 14:38:42 +03:00
Marko Mäkelä
9fac857f26 MDEV-34283 A misplaced btr_cur_need_opposite_intention() check may fail to prevent hangs
btr_cur_t::search_leaf(): Invoke btr_cur_need_opposite_intention() after
positioning page_cur.rec so that the record will be in the intended page.
This is something that was broken in
commit f2096478d5 or
commit de4030e4d4 or related changes.

btr_cur_need_opposite_intention(): Add a debug assertion that would
catch the misuse.

The "next line of defence" that should have caught this bug in debug builds
are assertions that mtr_t::m_memo contains MTR_MEMO_X_LOCK for the
dict_index_t::lock. When btr_cur_need_opposite_intention() holds,
we should escalate to acquiring an exclusive index->lock in
btr_cur_t::pessimistic_search_leaf().

Reviewed by: Debarun Banerjee
2024-06-06 13:03:34 +03:00
Marko Mäkelä
bc3660925d MDEV-34307 On startup, [FATAL] InnoDB: Page ... still fixed or dirty
buf_pool_invalidate(): Properly wait for
os_aio_wait_until_no_pending_writes() to ensure that there
are no pending buf_page_t::write_complete() or buf_page_write_complete()
operations. This will avoid a failure of buf_pool.assert_all_freed().

This bug should affect debug builds only. At this point, the
buf_pool.flush_list should be clear and all changes should have
been written out. The loop around buf_LRU_scan_and_free_block() should
have eventually completed and freed all pages as soon as
buf_page_t::write_complete() had a chance to release the page latches.

It is worth noting that buf_flush_wait() is working as intended.
As soon as buf_flush_page_cleaner() invokes
buf_pool.get_oldest_modification() it will observe that
buf_page_t::write_complete() had assigned oldest_modification_ to 1,
and remove such blocks from buf_pool.flush_list. Upon reaching
buf_pool.flush_list.count=0 the buf_flush_page_cleaner() will mark
itself idle and wake buf_flush_wait() by broadcasting
buf_pool.done_flush_list.

This regression was introduced in
commit a55b951e60 (MDEV-26827).

Reviewed by: Debarun Banerjee
2024-06-06 10:18:42 +03:00
Rucha Deodhar
0406b2a4ed MDEV-34143: Server crashes when executing JSON_EXTRACT after setting
non-default collation_connection

Analysis:
Due to different collation, the string has nothing to chop off.

Fix:
Got rid of chop(); append " ," only when we have more elements to
add to the result.
2024-06-06 11:41:01 +05:30
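The pattern described in the fix above (append a separator only when another element follows, instead of always appending and chopping the last one) can be sketched as follows. This is an illustration only: the function name and the ", " separator are assumptions, not the server's code.

```python
def join_matches(elements, sep=", "):
    """Concatenate elements, emitting sep only between two elements."""
    parts = []
    for i, element in enumerate(elements):
        parts.append(element)
        if i + 1 < len(elements):  # more elements follow
            parts.append(sep)
    return "".join(parts)


print(join_matches(["1", "2", "3"]))
```

Because nothing is appended after the last element, there is never a trailing separator to remove, so an empty or single-element result needs no special handling.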
Vladislav Vaintroub
ce9efb4e02 MDEV-34296 tpool - declare thread_local_waiter "static thread_local" 2024-06-05 16:55:11 +02:00
Nikita Malyavin
7d86751de5 mtr: run check-testcase client process under debugger 2024-06-05 16:50:51 +02:00
mariadb-DebarunBanerjee
b12c14e3b4 MDEV-34265 Possible hang during IO burst with innodb_flush_sync enabled
When checkpoint age goes beyond the sync flush threshold and
buf_flush_sync_lsn is set, page cleaner enters into "furious flush"
stage to aggressively flush dirty pages from flush list and pull
checkpoint LSN above safe margin. In this stage, page cleaner skips
doing LRU flush and eviction.

In 10.6, all other threads rely entirely on the page cleaner to generate
free pages. If free pages run out while the page cleaner is busy in the
"furious flush" stage, a session thread could wait for a free page in the
middle of a mini-transaction (mtr) while holding latches on other pages.

This, in turn, can prevent the page cleaner from flushing such pages, which
prevents the checkpoint LSN from moving forward and creates a deadlock. Even
otherwise, it could create a stall or hang-like situation for a large buffer
pool (BP) with plenty of dirty pages to flush before the stage could finish.

Fix: During furious flush, check and evict LRU pages after each flush
iteration.
2024-06-05 18:11:29 +05:30
Vladislav Vaintroub
db9c2d225e fix typo 2024-06-05 14:28:50 +02:00
Vladislav Vaintroub
bfd3f45e8e Appveyor - better filtering for branches to match buildbot 2024-06-05 12:26:46 +02:00
Vladislav Vaintroub
b242b44f0a Appveyor build - skip irrelevant commits
Since we're only building on Windows, skip changes to debian directory
and to shell scripts.
2024-06-05 12:13:33 +02:00
Vladislav Vaintroub
40abd973ab MDEV-34236 Mroonga build with ASAN/UBSAN with GCC 12+ extremely slow.
Workaround by disabling sanitizer for single source file.
2024-06-05 11:58:53 +02:00
Tuukka Pasanen
b204817986 MDEV-34261: Detect if build is running under 32-bit container
When building on a 64-bit kernel machine in a 32-bit Docker container,
CMake falsely (but it works as expected) detects that the container
runtime is also 64-bit. Use the linux32 command to change the runtime
environment to 32-bit, and then CMake will correctly disable, for
example, ColumnStore and not try to build it.

This commit only works with debian/autobake-debs.sh
2024-06-05 09:01:32 +01:00
Monty
38cbef8b3f MDEV-22935 Erroneous Aria Index / Optimizer behaviour
The problem was in the Aria part of the range optimizer,
maria_records_in_range(), which wrongly concluded that there were no rows
in the range.

This error would happen in the unlikely case when searching for a range
on a partial key and there was a match for the first key part in the
upper part of the b-tree (node) and also a match in the underlying
node page.

In other words, for this bug to happen one has to use Aria, have a multi
part key with a lot of identical values for the first key part and do a
range search on the second part of the key.

Fixed by ensuring that we do not stop searching for partial keys found
on node.

Other things:
- Added some comments
- Changed a variable name to more clearly explain its purpose.
- Fixed wrong cast in _ma_record_pos() that could cause problems on 32 bit
  systems.
2024-06-05 10:29:49 +03:00
Marko Mäkelä
c6d36c3e7c MDEV-34297 get_rnd_value() of ib_counter_t is unnecessarily complex
The shared counter template ib_counter_t uses the function
my_timer_cycles() as a source of pseudo-random numbers to pick a shard.
On some platforms, my_timer_cycles() could return the constant value 0.

get_rnd_value(): Remove.

my_pseudo_random(): Implement as an alias of my_timer_cycles() or
a wrapper for pthread_self().

Reviewed by: Vladislav Vaintroub
2024-06-05 09:54:14 +03:00
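The idea of a sharded counter that picks its shard from a cheap per-thread value can be sketched as follows. This is not ib_counter_t: the class is hypothetical, and threading.get_ident() merely stands in for the pthread_self()-based source mentioned in the commit above.

```python
import threading


class ShardedCounter:
    """Counter split into slots so that different threads usually touch different slots."""

    def __init__(self, shards: int = 64):
        self._slots = [0] * shards

    def _pick_shard(self) -> int:
        # Assumption: the thread identifier is used as the shard selector.
        return threading.get_ident() % len(self._slots)

    def add(self, n: int = 1) -> None:
        # A real implementation would use atomic increments; this sketch is
        # only exercised from a single thread.
        self._slots[self._pick_shard()] += n

    def total(self) -> int:
        return sum(self._slots)


counter = ShardedCounter(8)
for _ in range(100):
    counter.add()
print(counter.total())
```

The total is the sum over all slots, so the choice of shard affects contention only, not the result.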
ilyasa1211
ecf4a26107 Fix Indonesian month name.
Noticed on MySQL: https://github.com/mysql/mysql-server/pull/531

Matches https://icu4c-demos.unicode.org/icu-bin/locexp?d_=en&_=in_IN.
2024-06-05 14:06:16 +10:00
Igor Babaev
4d38267fc7 MDEV-29307 Wrong result when joining two derived tables over the same view
This bug could affect queries containing a join of derived tables over
grouping views such that one of the derived tables contains a window
function while another uses view V with dependent subquery DSQ containing
a set function aggregated outside of the subquery in the view V. The
subquery also refers to the fields from the group clause of the view. Due to
this bug execution of such queries could produce wrong result sets.

When the fix_fields() method performs context analysis of a set function AF
first, at the very beginning the function Item_sum::init_sum_func_check()
is called. The function copies the pointer to the embedding set function,
if any, stored in THD::LEX::in_sum_func into the corresponding field of the
set function AF simultaneously changing the value of THD::LEX::in_sum_func
to point to AF. When at the very end of the fix_fields() method the function
Item_sum::check_sum_func() is called it is supposed to restore the value
of THD::LEX::in_sum_func to point to the embedding set function. And in
fact Item_sum::check_sum_func() did it, but only for regular set functions,
not for those used in window functions. As a result after the context
analysis of AF had finished THD::LEX::in_sum_func still pointed to AF.
It confused the further context analysis. In particular it led to wrong
resolution of Item_outer_ref objects in the fix_inner_refs() function.
This wrong resolution forced reading the values of grouping fields referred
in DSQ not from the temporary table used for aggregation from which they
were supposed to be read, but from the table used as the source table for
aggregation.

This patch guarantees that the value of THD::LEX::in_sum_func is properly
restored after the call of fix_fields() for any set function.
2024-06-04 17:54:01 -07:00
Yuchen Pei
042a0d85ad MDEV-27186 spider/partition: Report error on info() failure
Like MDEV-28105, spider may attempt to connect to remote server in
info(), and it may emit an error upon failure to connect. In this
case, the downstream caller ha_partition::open() should return the
error to avoid inconsistency.

This fixes MDEV-27186, MDEV-27237, MDEV-27334, MDEV-28241, MDEV-34101.
2024-06-05 10:13:30 +10:00
Tuukka Pasanen
e9f4b87e53 MDEV-33919: Remove less standard format directive an-trap
A few man pages have a less standard format directive,
.it 1 an-trap, which specifies a formatting instruction
related to indentation (it adds a tab in the man page in this case).

There are no traces of what an-trap should do, and removing
it does not affect the rendering of the man pages.
2024-06-05 09:53:52 +10:00
Alexander Barkov
5e12d49205 MDEV-34295 CAST(char_col AS DOUBLE) prints redundant spaces in a warning
Field_string::val_int(), Field_string::val_real(), Field_string::val_decimal()
passed the whole buffer of field_length bytes to data type conversion routines.
This made the conversion routines print redundant trailing spaces in case of warnings.

Adding a method Field_string::to_lex_cstring() and using it inside
val_int(), val_real(), val_decimal(), val_str().

After this change conversion routines get the same value that val_str() returns,
and no redundant trailing spaces are displayed.
2024-06-04 15:34:14 +04:00
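The effect described in the commit above can be illustrated with a small sketch. The helper names and the warning text are illustrative assumptions; only the idea (a fixed-length buffer padded with spaces versus the trimmed value) comes from the commit message.

```python
def char_field_buffer(value: str, field_length: int) -> str:
    """Emulate a fixed-length CHAR buffer: the value padded with spaces."""
    return value.ljust(field_length)


def convert_to_double(text: str):
    """Return (value, warning); the warning quotes the text exactly as received."""
    try:
        return float(text), None
    except ValueError:
        return 0.0, f"Truncated incorrect DOUBLE value: '{text}'"


buf = char_field_buffer("abc", 10)
print(convert_to_double(buf)[1])              # quotes the padded buffer
print(convert_to_double(buf.rstrip(" "))[1])  # quotes the trimmed value
```

Passing the trimmed value, as val_str() would return it, keeps the padding out of the warning.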
Yuchen Pei
581712b989 MDEV-33490 MENT-1504 Fix some English strings in spider. 2024-06-04 12:25:08 +10:00
Thirunarayanan Balathandayuthapani
58a0e1e3dd MDEV-34223 Innodb - add status variable for number of bulk inserts
- Added a counter innodb_num_bulk_insert_operation in
INFORMATION_SCHEMA.GLOBAL_STATUS. This counter is incremented
whenever InnoDB performs a bulk insert operation.

- Change the innodb_instant_alter_column to an atomic variable.
2024-06-03 16:27:22 +05:30
Julius Goryavsky
c21aa486a8 MDEV-32633: additional post-merge changes for 10.5+ 2024-06-03 09:48:13 +02:00
Denis Protivensky
a4838721a2 MDEV-32633: Fix Galera cluster <-> native replication interaction
GTID events are applied without a running server transaction,
we need to set next transaction ID for Wsrep transaction.

The whole Galera cluster now has a single GTID value (including
the server ID throughout the cluster), fix the config accordingly.

Add force restart so that repeated MTR test execution prints
consistent GTID values, otherwise they would have been recovered
from the previous run.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-06-03 09:48:13 +02:00
Denis Protivensky
0cc9b49751 MDEV-32633: Fix Galera cluster <-> native replication interaction
It's possible to establish Galera multi-cluster setups connected
through the native replication when every Galera cluster is configured
to have a separate domain ID.
For this setup to work, we need to replace domain ID values in generated
GTID events when they are written at transaction commit to the values
configured by Wsrep replication.

At the same time, it's possible that the GTID event already contains
a correct domain ID if it comes through the native replication from
another Galera cluster.
In this case, when such an event is applied either through a native
replication slave thread or through Wsrep applier, we write GTID event
on transaction start and avoid writing it during transaction commit.

The code contained multiple problems that were fixed:
- applying GTID events didn't work because it's applied without a
running server transaction and Wsrep transaction was not started
- GTID event generation on transaction start didn't contain proper
"standalone" and "is_transactional" flags that the original applied
GTID event contained
- condition determining that GTID event is written on transaction start
to avoid writing it on commit relied on the fact that the GTID event
is the first found in transaction/statement caches, which wasn't the
case and resulted in duplicate GTID events written
- instead of relying on the caches to find a GTID event, a simple check
is introduced that follows the exact rules for checking if event is
written at transaction start as described above
- the test case is improved to check that exact GTID events are
applied after two Galera clusters have synced.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-06-03 09:48:13 +02:00
Denis Protivensky
a6b7203d65 MDEV-33952: Fix flaky galera_create_table_as_select test with debug sync
The test that triggers multi-master conflict between two CTAS commands
uses LOCK/UNLOCK TABLES to block local CTAS from progress. It could
result in a race when UNLOCK TABLES command is issued a bit earlier
than needed, causing local CTAS to run further and change wsrep
transaction state, so that a different code path is taken later and
the original error gets overridden, causing the test to fail.
The solution is to replace LOCK/UNLOCK TABLES with debug sync points.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-06-03 09:48:13 +02:00
Yuchen Pei
f2302a62e3 Merge branch '10.5' into 10.6 2024-05-31 09:10:17 +10:00
Yuchen Pei
25476ba1ae MDEV-29027 ASAN errors in spider_db_free_result after partition DDL
Spider calls ha_spider::close() at least twice on ALTER TABLE ... ADD
PARTITION. The first call frees wide_handler and the second call
accesses wide_handler->trx->thd (heap-use-after-free).

In general, there seems to be no problem with using THD obtained by
the macro current_thd() except in background threads. Thus, we simply
replace wide_handler->trx->thd with current_thd().

Original author: Nayuta Yanagisawa
2024-05-31 09:06:55 +10:00
Nayuta Yanagisawa
6d0c9872d9 MDEV-28522 Delete constant SPIDER_SQL_TYPE_*_HS
The HandlerSocket support of Spider has been deleted by MDEV-26858.
Thus, the constants, SPIDER_SQL_TYPE_*_HS, are no longer necessary.
2024-05-31 09:06:55 +10:00
Yuchen Pei
6c30220780 MDEV-26858 Spider: Remove dead code related to HandlerSocket
Remove the dead-code, in Spider, which is related to the Spider's
HandlerSocket support. The code has been disabled for a long time
and it is unlikely that the code will be enabled.

- rm all files under storage/spider/hs_client/ except hs_compat.h
- rm storage/spider/spd_db_handlersocket.*
- unifdef -UHS_HAS_SQLCOM -UHAVE_HANDLERSOCKET \
  -m storage/spider/spd_* storage/spider/ha_spider.* storage/spider/hs_client/*
- remove relevant files from storage/spider/CMakeLists.txt
2024-05-31 09:06:55 +10:00
Marko Mäkelä
5ba542e9ee Merge 10.5 into 10.6 2024-05-30 14:27:07 +03:00
Marko Mäkelä
0c440abd5e MDEV-31340 fixup: Add end-of-test marker 2024-05-30 14:23:45 +03:00
Marko Mäkelä
c71275b69e Fix ./mtr --repeat=2 main.func_str 2024-05-30 14:22:00 +03:00
Dave Gosselin
b0b463a894 MDEV-33616 Fix memleak in pfs_noop
Invoke cleanup routine at the end of pfs_noop.
2024-05-29 16:49:51 -04:00
Andrew Hutchings
1929a698a3 Update README for branch choice
This commit updates the README to indicate that the "Get the code, build
it, test it" link will help decide the correct branch to work in.

Also fixes a grammar issue and cleans-up the Markdown a little bit.
2024-05-29 13:49:32 +01:00
Souradeep Saha
83a04be84a Fix Various Typos
Fix various typos, in comments and DEBUG statements, and code changes
are non-functional.

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
2024-05-28 11:31:49 +10:00
Sergei Petrunia
36ab6cc80c MDEV-34125: ANALYZE FORMAT=JSON: r_engine_stats.pages_read_time_ms has wrong scale
- Change the comments in class ha_handler_stats to say the members
  are in ticks, not milliseconds.
- In sql_explain.cc, adjust the scale to print milliseconds.
2024-05-27 15:28:57 +03:00
Alexander Barkov
4a158ec167 MDEV-34226 On startup: UBSAN: applying zero offset to null pointer in my_copy_fix_mb from strings/ctype-mb.c and other locations
nullptr+0 is UB (undefined behavior).

- Fixing my_string_metadata_get_mb() to handle {nullptr,0} without UB.
- Fixing THD::copy_with_error() to disallow {nullptr,0} by DBUG_ASSERT().
- Fixing parse_client_handshake_packet() to call THD::copy_with_error()
  with an empty string {"",0} instead of NULL string {nullptr,0}.
2024-05-27 13:19:13 +04:00
Alexander Barkov
7925326183 MDEV-30931 UBSAN: negation of -X cannot be represented in type 'long long int'; cast to an unsigned type to negate this value to itself in get_interval_value on SELECT
- Fixing the code in get_interval_value() to use Longlong_hybrid_null.
  This allows correct handling of:

  - Signed and unsigned arguments
    (the old code assumed the argument to be signed)
  - The corner case with LONGLONG_MIN, avoiding undefined negation behavior

  This fixes the UBSAN warning:
    negation of -9223372036854775808 cannot be represented
    in type 'long long int';

- Fixing the code in get_interval_value() to avoid overflow in
  the INTERVAL_QUARTER and INTERVAL_WEEK branches.
  This fixes the UBSAN warning:
    signed integer overflow: -9223372036854775808 * 7 cannot be represented
    in type 'long long int'

- Fixing the INTERVAL_WEEK branch in date_add_interval() to handle
  huge numbers correctly. Before the change, huge positive numbers
  were treated as their negative complements.
  Note, some other branches still can be affected by this problem
  and should also be fixed eventually.
2024-05-27 13:19:13 +04:00
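The two UBSAN warnings quoted in the commit above come down to results that do not fit in a signed 64-bit integer. A minimal sketch of the range checks follows; Python integers do not overflow, so the 64-bit range is emulated explicitly, and all function names are hypothetical.

```python
LONGLONG_MIN = -(2 ** 63)
LONGLONG_MAX = 2 ** 63 - 1


def fits_longlong(value: int) -> bool:
    """True if value is representable as a signed 64-bit integer."""
    return LONGLONG_MIN <= value <= LONGLONG_MAX


def negation_fits(value: int) -> bool:
    """True if -value is representable; false only for LONGLONG_MIN."""
    return fits_longlong(-value)


def weeks_to_days(weeks: int) -> int:
    """Multiply by 7, rejecting results outside the signed 64-bit range."""
    days = weeks * 7
    if not fits_longlong(days):
        raise OverflowError("weeks * 7 does not fit in a signed 64-bit integer")
    return days


print(negation_fits(LONGLONG_MIN), negation_fits(LONGLONG_MAX))
print(weeks_to_days(2))
```

In C or C++ both operations are undefined behavior when the result is out of range, which is why the check has to happen before the arithmetic rather than after it.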
Thirunarayanan Balathandayuthapani
44b23bb184 MDEV-34222 Alter operation on redundant table aborts the server
- InnoDB page compression works only on COMPACT or DYNAMIC row
format tables. So InnoDB should throw an error when ALTER TABLE
tries to enable PAGE_COMPRESSED for a table in the REDUNDANT row format.
2024-05-24 15:48:19 +05:30
Thirunarayanan Balathandayuthapani
0ffa340a49 MDEV-34221 Errors about checksum mismatch on crash recovery are confusing
- InnoDB should avoid printing the error message before
restoring the first page from doublewrite buffer.
2024-05-24 12:57:42 +05:30
Vladislav Vaintroub
736449d30f MDEV-34205: ASAN stack buffer overflow in strxnmov() in frm_file_exists
Correct the second parameter for strxnmov to prevent potential buffer
overflows. The second parameter must be one less than the size of the
input buffer to avoid writing past the end of the buffer.

While the second parameter is usually correct, there are exceptions
that need fixing.

This commit addresses the issue within frm_file_exists() and other
affected places.
2024-05-23 22:08:27 +02:00
Alexander Barkov
7c4c082349 MDEV-28387 UBSAN: runtime error: negation of -9223372036854775808 cannot be represented in type 'long long int'; cast to an unsigned type to negate this value to itself in my_strtoll10 on SELECT
Fixing the condition so that an overflow is raised if the ulonglong
representation of the number is greater than or equal to 0x8000000000000000ULL.
Before this change the condition did not catch -9223372036854775808
(the smallest possible signed negative longlong number).
2024-05-23 14:18:34 +04:00
Yuchen Pei
c4020b541c MDEV-24610 MEMORY SE: check overflow in info calls with HA_STATUS_AUTO 2024-05-22 09:18:09 +10:00