Commit graph

621 commits

Author SHA1 Message Date
Marko Mäkelä
15700f54c2 Merge 11.4 into 11.7 2025-01-09 09:41:38 +02:00
Marko Mäkelä
17f01186f5 Merge 10.11 into 11.4 2025-01-09 07:58:08 +02:00
Marko Mäkelä
3f914afd3a Merge 10.6 into 10.11 2025-01-02 12:39:56 +02:00
Julius Goryavsky
3cd9f9d1b3 Merge branch '10.5' into '10.6' 2024-12-18 05:09:23 +01:00
Daniele Sciascia
75dd0246f8 Remove error handling from wsrep_sync_wait()
Let the wsrep-lib error be set/overriden at the end of
dispatch_command().

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-12-17 09:52:32 +01:00
Kristian Nielsen
0f47db8525 Merge 10.11 -> 11.4
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-12-05 11:01:42 +01:00
Kristian Nielsen
e7c6cdd842 Merge 10.6 -> 10.11
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-12-05 10:11:58 +01:00
Kristian Nielsen
0166c89e02 Merge 10.5 -> 10.6
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-12-05 09:20:36 +01:00
Teemu Ollakka
a2575a0703 MDEV-35465 Async replication stops working on Galera async replica node when parallel replication is enabled
Parallel slave failed to retry in retry_event_group() with error

    WSREP: Parallel slave worker failed at wsrep_before_command() hook

Fix wsrep transaction cleanup/restart in retry_event_group() to properly
clean up previous transaction by calling wsrep_after_statement().
Also move call to reset error after call to wsrep_after_statement()
to make sure that it remains effective.

Add a MTR test galera_as_slave_parallel_retry to reproduce the error
when the fix is not present.

Other issues which were detected when testing with sysbench:

Check if parallel slave is killed for retry before waiting for prior
commits in THD::wsrep_parallel_slave_wait_for_prior_commit(). This
is required with slave-parallel-mode=optimistic to avoid deadlock
when a slave later in commit order manages to reach prepare phase
before a lock conflict is detected.

Suppress wsrep applier specific warning for slave threads.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-12-03 15:05:32 +01:00
Marko Mäkelä
33907f9ec6 Merge 11.4 into 11.7 2024-12-02 17:51:17 +02:00
Marko Mäkelä
2719cc4925 Merge 10.11 into 11.4 2024-12-02 11:35:34 +02:00
Marko Mäkelä
3d23adb766 Merge 10.6 into 10.11 2024-11-29 13:43:17 +02:00
Jan Lindström
f5aed74573 MDEV-35486 : MDEV-33997 test failed
Problem was that at wsrep_to_isolation_end saved_lock_wait_timeout
variable was set to thd->variables.lock_wait_timeout when RSU
is used and variable value was 0 leading sporadic lock wait timeout
errors. Fixed by removing incorrect variable set.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-11-27 13:00:08 +01:00
Oleksandr Byelkin
b12ff287ec Merge branch '11.6' into 11.7 2024-11-10 19:22:21 +01:00
Thirunarayanan Balathandayuthapani
074831ec61 Merge branch 10.5 into 10.6 2024-11-08 18:17:15 +05:30
Oleksandr Byelkin
9e1fb104a3 MariaDB 11.4.4 release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEF39AEP5WyjM2MAMF8WVvJMdM0dgFAmck77AACgkQ8WVvJMdM
 0dgccQ/+Lls8fWt4D+gMPP7x+drJSO/IE/gZFt3ugbWF+/p3B2xXAs5AAE83wxEh
 QSbp4DCkb/9PnuakhLmzg0lFbxMUlh4rsJ1YyiuLB2J+YgKbAc36eQQf+rtYSipd
 DT5uRk36c9wOcOXo/mMv4APEvpPXBIBdIL4VvpKFbIOE7xT24Sp767zWXdXqrB1f
 JgOQdM2ct+bvSPC55oZ5p1kqyxwvd6K6+3RB3CIpwW9zrVSLg7enT3maLjj/761s
 jvlRae+Cv+r+Hit9XpmEH6n2FYVgIJ3o3WhdAHwN0kxKabXYTg7OCB7QxDZiUHI9
 C/5goKmKaPB1PCQyuTQyLSyyK9a8nPfgn6tqw/p/ZKDQhKT9sWJv/5bSWecrVndx
 LLYifSTrFC/eXLzgPvCnNv/U8SjsZaAdMIKS681+qDJ0P5abghUIlGnMYTjYXuX1
 1B6Vrr0bdrQ3V1CLB3tpkRjpUvicrsabtuAUAP65QnEG2G9UJXklOer+DE291Gsl
 f1I0o6C1zVGAOkUUD3QEYaHD8w7hlvyfKme5oXKUm3DOjaAar5UUKLdr6prxRZL4
 ebhmGEy42Mf8fBYoeohIxmxgvv6h2Xd9xCukgPp8hFpqJGw8abg7JNZTTKH4h2IY
 J51RpD10h4eoi6WRn3opEcjexTGvZ+xNR7yYO5WxWw6VIre9IUA=
 =s+WW
 -----END PGP SIGNATURE-----

Merge tag '11.4' into 11.6

MariaDB 11.4.4 release
2024-11-08 07:17:00 +01:00
Jan Lindström
e4a3a11dcc MDEV-35344 : Galera test failure on galera_sync_wait_upto
Ignoring configured server_id should not be a warning because
correct configuration is documented. Changed message to info
level with more detailed message what was configured and
what will be actually used.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-11-06 04:59:10 +01:00
Denis Protivensky
6d5fe9ed0d MDEV-28378: Don't hang trying to peek log event past the end of log
While applying CTAS log event, we peek the relay log to see if CTAS
contains inserted rows or if it's empty.
The peek function didn't check for end-of-file condition when tried to
get the next event from the log, and thus it hanged.

The fix includes checking for end-of-file while peeking for log events
and considering returned XID_EVENT value as a sign of an empty CTAS.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-11-06 04:59:09 +01:00
Oleksandr Byelkin
c770bce898 Merge branch '11.2' into 11.4 2024-10-30 15:11:17 +01:00
Oleksandr Byelkin
69d033d165 Merge branch '10.11' into 11.2 2024-10-29 16:42:46 +01:00
Oleksandr Byelkin
3d0fb15028 Merge branch '10.6' into 10.11 2024-10-29 15:24:38 +01:00
Oleksandr Byelkin
f00711bba2 Merge branch '10.5' into 10.6 2024-10-29 14:20:03 +01:00
Jan Lindström
b3be3c2157 MDEV-30653 : With wsrep_mode=REPLICATE_ARIA only part of mixed-engine transactions is replicated
Replication of non-transactional engines is experimental and
uses TOI. This naturally means that if there is open transaction
with transactional engine it's changes will be rolled back.

Fixed by adding error message if non-transactional engine
is part of multi-engine transaction with warning.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-10-23 04:00:52 +02:00
Alexander Barkov
e1cd3c4033 MDEV-12252 ROW data type for stored function return values
Adding support for the ROW data type in the stored function RETURNS clause:

- explicit ROW(..members...) for both sql_mode=DEFAULT and sql_mode=ORACLE

  CREATE FUNCTION f1() RETURNS ROW(a INT, b VARCHAR(32)) ...

- anchored "ROW TYPE OF [db1.]table1" declarations for sql_mode=DEFAULT

  CREATE FUNCTION f1() RETURNS ROW TYPE OF test.t1 ...

- anchored "[db1.]table1%ROWTYPE" declarations for sql_mode=ORACLE

  CREATE FUNCTION f1() RETURN test.t1%ROWTYPE ...

Adding support for anchored scalar data types in RETURNS clause:

- "TYPE OF [db1.]table1.column1" for sql_mode=DEFAULT

  CREATE FUNCTION f1() RETURNS TYPE OF test.t1.column1;

- "[db1.]table1.column1" for sql_mode=ORACLE

  CREATE FUNCTION f1() RETURN test.t1.column1%TYPE;

Details:

- Adding a new sql_mode_t parameter to
    sp_head::create()
    sp_head::sp_head()
    sp_package::create()
    sp_package::sp_package()
  to guarantee early initialization of sp_head::m_sql_mode.
  Before this change, this member was not initialized at all during
  CREATE FUNCTION/PROCEDURE/PACKAGE statements, and was not used.
  Now it needs to be initialized to write properly the
  mysql.proc.returns column, according to the create time sql_mode.

- Code refactoring to make the things simpler and functions smaller:

  * Adding a new method
    Field_row::row_create_fields(THD *thd, List<Spvar_definition> *list)
    to make a Virtual_tmp_table with Fields for ROW members
    from an explicit definition.

  * Adding a new method
    Field_row::row_create_fields(THD *thd, const Spvar_definition &def)
    to make a Virtual_tmp_table with Fields for ROW members
    from an explicit or a table anchored definition.

  * Adding a new method
    Item_args::add_array_of_item_field(THD *thd, const Virtual_tmp_table &vtable)
    to create and array of Item_field corresponding to all Field instances
    in a Virtual_tmp_table

  * Removing Item_field_row::row_create_items(). It was decomposed
    into the new methods described above.

  * Moving the code from the loop body in sp_rcontext::init_var_items()
    into a separate method Spvar_definition::make_item_field_row(),
    to make the code clearer (smaller functions).
    make_item_field_row() itself uses the new methods described above.

- Changing the data type of sp_head::m_return_field_def
  from Column_definition to Spvar_definition.
  So now it supports not only SQL column field types,
  but also explicit ROW and anchored ROW data types,
  as well as anchored column types.

- Adding a new Column_definition parameter to sp_head::create_result_field().
  Before this patch, create_result_field() took the definition only
  from m_return_field_def. Now it's also called with a local Column_definition
  variable which contains the explicit definition resolved from an
  anchored defition.

- Modifying sql_yacc.yy to support the new grammar.
  Adding new helper methods:
    * sf_return_fill_definition_row()
    * sf_return_fill_definition_rowtype_of()
    * sf_return_fill_definition_type_of()

- Fixing tests in:
  * Virtual_tmp_table::setup_field_pointers() in sql_select.cc
  * Send_field::normalize() in field.h
  * store_column_type()
  to prevent calling Type_handler_row::field_type(),
  which is implemented a DBUG_ASSERT(0).
  Before this patch the affected methods and functions were called only
  for scalar data types. Now ROW is also possible.

- Adding a new virtual method Field::cols()

- Overriding methods:
   Item_func_sp::cols()
   Item_func_sp::element_index()
   Item_func_sp::check_cols()
   Item_func_sp::bring_value()
  to support the ROW data type.

- Extending the rule sp_return_type to support
  * explicit ROW and anchored ROW data types
  * anchored scalar data types

- Overriding Field_row::sql_type() to print
  the data type of an explicit ROW.
2024-10-21 07:59:29 +04:00
Monty
bddbef3573 MDEV-34533 asan error about stack overflow when writing record in Aria
The problem was that when using clang + asan, we do not get a correct value
for the thread stack as some local variables are not allocated at the
normal stack.

It looks like that for example clang 18.1.3, when compiling with
-O2 -fsanitize=addressan it puts local variables and things allocated by
alloca() in other areas than on the stack.

The following code shows the issue

Thread 6 "mariadbd" hit Breakpoint 3, do_handle_one_connection
    (connect=0x5080000027b8,
    put_in_cache=<optimized out>) at sql/sql_connect.cc:1399

THD *thd;
1399      thd->thread_stack= (char*) &thd;
(gdb) p &thd
(THD **) 0x7fffedee7060
(gdb) p $sp
(void *) 0x7fffef4e7bc0

The address of thd is 24M away from the stack pointer

(gdb) info reg
...
rsp            0x7fffef4e7bc0      0x7fffef4e7bc0
...
r13            0x7fffedee7060      140737185214560

r13 is pointing to the address of the thd. Probably some kind of
"local stack" used by the sanitizer

I have verified this with gdb on a recursive call that calls alloca()
in a loop. In this case all objects was stored in a local heap,
not on the stack.

To solve this issue in a portable way, I have added two functions:

my_get_stack_pointer() returns the address of the current stack pointer.
The code is using asm instructions for intel 32/64 bit, powerpc,
arm 32/64 bit and sparc 32/64 bit.
Supported compilers are gcc, clang and MSVC.
For MSVC 64 bit we are using _AddressOfReturnAddress()

As a fallback for other compilers/arch we use the address of a local
variable.

my_get_stack_bounds() that will return the address of the base stack
and stack size using pthread_attr_getstack() or NtCurrentTed() with
fallback to using the address of a local variable and user provided
stack size.

Server changes are:

- Moving setting of thread_stack to THD::store_globals() using
  my_get_stack_bounds().
- Removing setting of thd->thread_stack, except in functions that
  allocates a lot on the stack before calling store_globals().  When
  using estimates for stack start, we reduce stack_size with
  MY_STACK_SAFE_MARGIN (8192) to take into account the stack used
  before calling store_globals().

I also added a unittest, stack_allocation-t, to verify the new code.

Reviewed-by: Sergei Golubchik <serg@mariadb.org>
2024-10-16 17:24:46 +03:00
Marko Mäkelä
43465352b9 Merge 11.4 into 11.6 2024-10-03 16:09:56 +03:00
Marko Mäkelä
b53b81e937 Merge 11.2 into 11.4 2024-10-03 14:32:14 +03:00
Marko Mäkelä
12a91b57e2 Merge 10.11 into 11.2 2024-10-03 13:24:43 +03:00
Marko Mäkelä
63913ce5af Merge 10.6 into 10.11 2024-10-03 10:55:08 +03:00
Marko Mäkelä
7e0afb1c73 Merge 10.5 into 10.6 2024-10-03 09:31:39 +03:00
Yuchen Pei
ba7088d462
Merge '11.4' into 11.6 2024-10-03 15:59:20 +10:00
Denis Protivensky
9f61aa4f8a MDEV-34822 pre-fix: Make wsrep_ready flag read lock-free
It's read for every command execution, and during slave replication
for every applied event.

It's also planned to be used during write set applying, so it means
mostly every server thread is going to compete for the mutex covering
this variable, especially considering how rarely it changes.
Converting wsrep_ready to atomic relaxes the things.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-26 00:04:56 +02:00
Marko Mäkelä
762ad01c7f Merge 11.2 into 11.4 2024-09-13 13:09:23 +03:00
Marko Mäkelä
f0de610d0c Merge 10.11 into 11.2 2024-09-10 18:35:16 +03:00
Sergei Petrunia
2c3b298337 Merge 11.2 into 11.4 2024-09-09 14:40:02 +03:00
Sergei Petrunia
abd98336d2 Merge 10.11 -> 11.2 2024-09-09 13:50:38 +03:00
Marko Mäkelä
f9f92b480e Merge 10.6 into 10.11 2024-09-06 16:17:42 +03:00
Denis Protivensky
4e2c02a12c MDEV-33133: MDL conflict handling code should skip BF-aborted trxs
It's possible that MDL conflict handling code is called more
than once for a transaction when:
- it holds more than one conflicting MDL lock
- reschedule_waiters() is executed,
which results in repeated attempts to BF-abort already aborted
transaction.
In such situations, it might be that BF-aborting logic sees
a partially rolled back transaction and erroneously decides
on future actions for such a transaction.

The specific situation tested and fixed is when a SR transaction
applied in the node gets BF-aborted by a started TOI operation.
It's then caught with the server transaction already rolled back,
but with no MDL locks yet released. This caused wrong state
detection for such a transaction during repeated MDL conflict
handling code execution.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-03 07:45:57 +02:00
Julius Goryavsky
d5a669b6b6 Merge branch '10.5' into '10.6' 2024-09-03 07:44:51 +02:00
Julius Goryavsky
d058be62b8 Merge branch '10.6' into '10.11' 2024-09-02 03:49:03 +02:00
Denis Protivensky
235f33e360 MDEV-33133: MDL conflict handling code should skip BF-aborted trxs
It's possible that MDL conflict handling code is called more
than once for a transaction when:
- it holds more than one conflicting MDL lock
- reschedule_waiters() is executed,
which results in repeated attempts to BF-abort already aborted
transaction.
In such situations, it might be that BF-aborting logic sees
a partially rolled back transaction and erroneously decides
on future actions for such a transaction.

The specific situation tested and fixed is when a SR transaction
applied in the node gets BF-aborted by a started TOI operation.
It's then caught with the server transaction already rolled back,
but with no MDL locks yet released. This caused wrong state
detection for such a transaction during repeated MDL conflict
handling code execution.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-01 16:19:59 +02:00
Julius Goryavsky
bac0804d81 Merge branch '10.5' into '10.6' 2024-09-01 06:51:25 +02:00
Alexey Yurchenko
731a5aba0b Use only MySQL code for TOI error vote
For TOI events specifically we have a situation where in case of the
same error different nodes may generate different messages. This may
be for two reasons:
 - different locale setting between the current client session and
   server default (we can reasonably require server locales to be
   identical on all nodes, but user can change message locale for the
   session)
 - non-deterministic course of STATEMENT execution e.g. for ALTER TABLE

On the other hand we may reasonably expect TOI event failures since
they are executed after replication, so we must ensure that voting is
consistent. For that purpose error codes should be sufficiently unique
and deterministic for TOI event failures as DDLs normally deal with
a single object, so we can merely use MySQL error codes to vote on.

Notice that this problem does not happen with regular transactional
writesets, since the originator node will always vote success and
replica nodes are assumed to have the same global locale setting.
As such different error messages indicate different errors even if
the error code is the same (e.g. ER_DUP_KEY can happen on different
rows tables).

Use only MySQL error code (without the error message) for error voting
in case of TOI event failure.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-01 02:58:27 +02:00
Alexey Yurchenko
a1e5a284fc MDEV-31809 Automatic SST user account management
Implement automatic creation of temporary accounts for SST and pass
account credentials to SST script via socket as opposed to environment
variables. Delete the user after the SST script returns,

Respect wsrep_sst_auth set by the adminitrator in case some additional
privilege grants are needed for particular SST method.

mysqldump SST requires significant change to make use of the new
automatic user generation facility. For now just make it compatible
by ignoring automatically generated user and rely only on wsrep_sst_auth
setting on the joiner node to keep backward compatibility.

Adapt mysqldump SST to automatic SST user generation changes:
 - disable special treatment for mysqldump SST on donor
 - make mysqldump SST script compatible with the new SST script
   interface.

Differentiate user privileges for different SST methods:
 - grant minimum required privileges for clone and xtrabackup SST
   accounts
 - grant all privileges to custom SST accounts as it is not known what
   is needed.
 - disable SST account generation for rsync SST since it is not needed.

MTR tests:
 - add MTR tests for clone and xtrabackup SSTs without wsrep_sst_auth,
 - add MTR test for testing masking of wsrep_sst_auth.
 - don't attmept to restore original wsrep_sst_auth in MTR tests as it
   is always masked.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-06-10 23:29:05 +02:00
Julius Goryavsky
c21aa486a8 MDEV-32633: additional post-merge changes for 10.5+ 2024-06-03 09:48:13 +02:00
Denis Protivensky
0cc9b49751 MDEV-32633: Fix Galera cluster <-> native replication interaction
It's possible to establish Galera multi-cluster setups connected
through the native replication when every Galera cluster is configured
to have a separate domain ID.
For this setup to work, we need to replace domain ID values in generated
GTID events when they are written at transaction commit to the values
configured by Wsrep replication.

At the same time, it's possible that the GTID event already contains
a correct domain ID if it comes through the native replication from
another Galera cluster.
In this case, when such an event is applied either through a native
replication slave thread or through Wsrep applier, we write GTID event
on transaction start and avoid writing it during transaction commit.

The code contained multiple problems that were fixed:
- applying GTID events didn't work because it's applied without a
running server transaction and Wsrep transaction was not started
- GTID event generation on transaction start didn't contain proper
"standalone" and "is_transactional" flags that the original applied
GTID event contained
- condition determining that GTID event is written on transaction start
to avoid writing it on commit relied on the fact that the GTID event
is the first found in transaction/statement caches, which wasn't the
case and resulted in duplicate GTID events written
- instead of relying on the caches to find a GTID event, a simple check
is introduced that follows the exact rules for checking if event is
written at transaction start as described above
- the test case is improved to check that exact GTID events are
applied after two Galera clusters have synced.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-06-03 09:48:13 +02:00
Sergei Golubchik
201e28fac1 cleanup: redundant condition 2024-05-27 12:39:02 +02:00
Monty
ab513b007b Optimize checking if a table is a statistics table 2024-05-27 12:39:02 +02:00
Oleksandr Byelkin
dd7d9d7fb1 Merge branch '11.4' into 11.5 2024-05-23 17:01:43 +02:00
Oleksandr Byelkin
99b370e023 Merge branch '11.2' into 11.4 2024-05-21 19:38:51 +02:00