Commit graph

201295 commits

Author SHA1 Message Date
Alexander Barkov
6cecf61a59 MDEV-34417 Wrong result set with utf8mb4_danish_ci and BNLH join
There were erroneous calls for charpos() in key_hashnr() and key_buf_cmp().
These functions are never called with prefix segments.

The charpos() calls were wrong. Before the change, BNLH joins
- could return wrong result sets, as reported in MDEV-34417
- were extremely slow for multi-byte character sets, because
  the hash was calculated on string prefixes, which increased
  the amount of collisions drastically.

This patch fixes the wrong result set as reported in MDEV-34417,
as well as (partially) the performance problem reported in MDEV-34352.
2024-06-20 11:30:02 +04:00
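The collision effect described in the commit above can be illustrated with a minimal sketch. It is not the server code: `hash_key`, the sample keys, and the use of `zlib.crc32` as a stand-in hash are all assumptions made for illustration. It only shows that hashing a short prefix of multi-byte keys maps many distinct keys to one hash value.

```python
import zlib

def hash_key(data: bytes, length: int) -> int:
    """Hash the first `length` bytes of `data` (crc32 is a stand-in hash)."""
    return zlib.crc32(data[:length])

# Hypothetical join keys sharing a multi-byte prefix ("ø" is 2 bytes in UTF-8).
keys = [f"søster_{i:03d}".encode("utf-8") for i in range(100)]

# Buggy variant: only a prefix of each key is hashed.
prefix_hashes = {hash_key(k, 4) for k in keys}

# Fixed variant: the whole key is hashed.
full_hashes = {hash_key(k, len(k)) for k in keys}

print(len(prefix_hashes), len(full_hashes))
```

With the prefix variant every key falls into the same hash chain, so each probe degrades to a linear scan of that chain.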
Daniel Black
40eb56b609 MDEV-34430: remove Debian character-set-collations
Since MDEV-25829 (Change default Unicode collation to uca1400_ai_ci),
there is no need to set character-set-collations explicitly, as it's
the default.

Furthermore, the change in defaults affects all character sets that
support the uca1400_ai_ci collation.
2024-06-20 17:17:18 +10:00
Monty
279aa1e6b4 Disable new connections in case of fatal signal
A user reported that MariaDB server got a signal 6 but still accepted new
connections and did not crash.  I have not been able to find a way to
repeat this or find the cause of the issue. However, to make it easier to
notice that the server is unstable, I added code to disable new
connections when the handle_fatal_signal() handler has been called.
2024-06-20 09:53:01 +03:00
Monty
3541bd63f0 MDEV-33582 Add more warnings to be able to better diagnose network issues
Changed the logged messages from errors to warnings
Also changed 'remain' to 'read_length' in the warning to make it more readable.
2024-06-20 09:53:01 +03:00
Vladislav Vaintroub
6c2cd4cf56 MDEV-34428 bootstrap can't delete tempfile, it is already gone
The problem is seen on CI, where TEMP pointed to directory outside of
the usual vardir, when testing mysql_install_db.exe
A likely cause for this error is that TEMP was periodically cleaned up
by some automation running on the host, perhaps by buildbot itself.

To fix, mysql_install_db.exe will now use datadir as --tmpdir
for the bootstrap run. This will minimize chances to run into any
environment problems.
2024-06-19 22:16:02 +02:00
Vicențiu Ciorbaru
6382339144 MDEV-34311: Alter USER should reset all account limit counters
This commit introduces a reset of password errors counter on any alter user
command for the altered user. This is done so as to not require a
complete privilege system reload.
2024-06-19 23:08:35 +03:00
Vicențiu Ciorbaru
2d8d813941 cleanup, refactor
Fix coding style and extract common password reset counter code into
separate ACL_USER method.
2024-06-19 23:08:35 +03:00
Iaroslav Babanin
5d49a2add7 MDEV-33935 fix deadlock counter
- The deadlock counter was moved from
Deadlock::find_cycle into Deadlock::report, because
the find_cycle method is called multiple times during deadlock
detection flow, which means it shouldn't have such side effects.
But report() can, as it is called only once for
a victim transaction.
- Also the deadlock_detect.test and *.result test case
have been extended to cover the fix.
2024-06-19 20:43:33 +03:00
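The reasoning in the commit above (keep side effects out of a function that runs many times, and put them in the one that runs once per victim) can be sketched as follows. The class and its fields are hypothetical and only mirror the names find_cycle and report from the message. This is not the InnoDB implementation.

```python
class DeadlockDetector:
    """Toy wait-for graph checker; hypothetical, for illustration only."""

    def __init__(self):
        self.deadlocks = 0  # counter that must grow once per deadlock

    def find_cycle(self, waits_for, start):
        """Return True if following waits_for edges from `start` returns to it.

        Deliberately free of side effects: it may be called several times
        while a single deadlock is being resolved.
        """
        seen = set()
        node = waits_for.get(start)
        while node is not None and node not in seen:
            if node == start:
                return True
            seen.add(node)
            node = waits_for.get(node)
        return False

    def report(self, victim):
        """Called once per victim transaction, so counting here is safe."""
        self.deadlocks += 1
        return f"transaction {victim} chosen as deadlock victim"


detector = DeadlockDetector()
waits_for = {1: 2, 2: 3, 3: 1}  # 1 waits for 2, 2 for 3, 3 for 1

# The cycle check runs once per participating transaction...
results = [detector.find_cycle(waits_for, trx) for trx in (1, 2, 3)]
# ...but the deadlock is reported, and counted, only once.
message = detector.report(victim=3)
```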
Brandon Nesterenko
8c8b3ab784 MDEV-34274: Test rpl.rpl_change_master_demote frequently fails on buildbot with "IO thread should not be running..."
The test rpl.rpl_change_master_demote used a `sleep 1` command
to give time for a START SLAVE UNTIL to start the slave threads
and wait for them to automatically die by UNTIL.  On machines
with heavy load (especially MSAN bb builders), one second was
not enough, and the test would fail due to the IO thread
still being up.

This patch fixes the test by replacing the sleep with specific
conditions to wait for. The test cannot wait for the IO or SQL
threads to start, as it would be possible that they would be
started and stopped by the time the MTR executor would check
the slave status. So instead, we test for proof that they
existed via the Connections status variable being incremented
by at least 2 (Connections just shows the global thread id).
At this point, we still can't use the wait_for_slave_to_stop
helper, as the SQL/IO_Running fields of SHOW SLAVE STATUS
may not be updated yet. So instead, we use
information_schema.processlist, which would show the presence
of the Slave_SQL/IO threads. So to "wait for the slave to stop",
we wait for the Slave_SQL/IO threads to be gone from the
processlist.
2024-06-19 10:17:08 -06:00
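The general idea of the fix above, replacing a fixed sleep with a wait on an observable condition, can be sketched like this. The helper `wait_until` and the simulated processlist snapshots are assumptions for illustration. The actual test uses MTR include files, not Python.

```python
import time

def wait_until(condition, timeout=30.0, interval=0.1):
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)

# Simulated processlist snapshots: the slave threads are gone on the third poll.
snapshots = iter([
    ["Slave_IO", "Slave_SQL", "Query"],
    ["Slave_SQL", "Query"],
    ["Query"],
])

def slave_threads_gone():
    """True once no Slave_* thread appears in the current snapshot."""
    commands = next(snapshots)
    return not any(c.startswith("Slave_") for c in commands)

stopped = wait_until(slave_threads_gone, timeout=5.0, interval=0.01)
```

Unlike a one-second sleep, the wait finishes as soon as the condition holds on a fast machine and keeps waiting, up to the timeout, on a loaded one.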
Jan Lindström
ee974ca5e0 MDEV-31658 : Deadlock found when trying to get lock during applying
The problem was that there were two non-conflicting local idle
transactions in node_1 that both inserted a key into the primary key.
Then two transactions from other nodes also inserted
a key into the primary key, so that the insert from node_2 conflicted
with one of the local transactions in node_1 and there would
be a duplicate key if both were committed. For this, the insert
from the other node tries to acquire an S-lock for this record,
and because this insert is a high priority brute force (BF)
transaction, it will kill the idle local transaction.

Concurrently, second insert from node_3 conflicts the second
idle insert transaction in node_1. Again, it tries to acquire
S-lock for this record and kills idle local transaction.

At this point we have two non-conflicting high priority
transactions holding S-lock on different records in node_1.
For example like this: rec s-lock-node2-rec s-lock-node3-rec rec.

Because these high priority BF-transactions do not wait for
each other, the insert from node3, which has a later seqno than
the insert from node2, can continue. It will try to acquire an
insert intention lock for the record it tries to insert (to avoid
a duplicate key being inserted by a local transaction). However,
it will note that there is a conflicting S-lock in the same gap
between records. This will lead to a deadlock error, as we have
defined that BF-transactions may not wait for a record lock,
but we can't kill the conflicting BF-transaction because
it has a lower seqno and it should commit first.

BF-transactions are executed concurrently because their
values to primary key are different i.e. they do not
conflict.

Galera certification will make sure that inserts from
other nodes, i.e. these high priority BF-transactions,
can't insert duplicate keys. Local transactions naturally
can but they will be killed when BF-transaction
acquires required record locks.

Therefore, we can allow situation where there is conflicting
S-lock and insert intention lock regardless of their seqno
order and let both continue with no wait. This will lead
to situation where we need to allow BF-transaction
to wait when lock_rec_has_to_wait_in_queue is called
because this function is also called from
lock_rec_queue_validate and because lock is waiting
there would be assertion in ut_a(lock->is_gap()
|| lock_rec_has_to_wait_in_queue(cell, lock));

lock_wait_wsrep_kill
  Add debug sync points for BF-transactions killing
  local transaction.

wsrep_assert_no_bf_bf_wait
  Print also requested lock information

lock_rec_has_to_wait
  Add function to handle wsrep transaction lock wait
  cases.

lock_rec_has_to_wait_wsrep
  New function to handle wsrep transaction lock wait
  exceptions.

lock_rec_has_to_wait_in_queue
  Remove wsrep exception, in this function all
  conflicting locks need to wait in queue.
  Conflicts between BF and local transactions
  are handled in lock_wait.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-06-19 14:09:11 +02:00
Marko Mäkelä
f9e717cb48 MDEV-34426: Assertion failure on bootstrap
Let us relax an assertion that would fail in
./mtr main.1st --mysqld=--innodb-undo-tablespaces=127
2024-06-19 15:08:19 +03:00
Julius Goryavsky
2f0e7f665c galera: syncing SST scripts code with the following versions 2024-06-19 14:07:34 +02:00
Jan Lindström
1001dae186 MDEV-12008 : Change error code for Galera unkillable threads
Changed error code for Galera unkillable threads to
be ER_KILL_DENIED_HIGH_PRIORITY giving message

This is a high priority thread/query and cannot be killed
without the compromising consistency of the cluster

also a warning is produced
  Thread %lld is [wsrep applier|high priority] and cannot be killed

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-06-19 14:07:34 +02:00
Marko Mäkelä
34813c1aa0 Merge 10.6 into 10.11 2024-06-19 15:04:07 +03:00
Andrei
387bdb2a2e MDEV-29934 rpl.rpl_start_alter_chain_basic, rpl.rpl_start_alter_restart_slave sometimes fail in BB with result content mismatch
rpl.rpl_start_alter_chain_basic used to fail sporadically due
to a missing GTID master-slave synchronization, which was necessary
because of the subsequent SELECT from the GTID-state table.

Fixed by adding two synchronization points for the two
chain slaves that require them.

Note rpl.rpl_start_alter_restart_slave must have been fixed by
MDEV-30460 and 87e13722a9 (manual) merge commit.
2024-06-19 14:09:00 +03:00
Marko Mäkelä
5b26a07698 MDEV-34178: Enable spinloop for index_lock
In an I/O bound concurrent INSERT test conducted by Mark Callaghan,
spin loops on dict_index_t::lock turn out to be beneficial.

This is a mixed bag; enabling the spin loops will improve throughput
and latency on some workloads and degrade them on others.

Reviewed by: Debarun Banerjee
Tested by: Matthias Leich
Performance tested by: Axel Schwenke
2024-06-19 13:41:11 +03:00
Marko Mäkelä
f8d213bd0e MDEV-34178: Improve the spin loops
srw_mutex_impl<spinloop>::wait_and_lock(): Invoke srw_pause() and
reload the lock word on each loop. Thanks to Mark Callaghan for
suggesting this.

ssux_lock_impl<spinloop>::rd_wait(): Actually implement a spin loop
on the rw-lock component without blocking on the mutex component.
If there is a conflict with wr_lock(), wait for writer.lock to be
released without actually acquiring it.

Reviewed by: Debarun Banerjee
Tested by: Matthias Leich
2024-06-19 13:40:57 +03:00
Marko Mäkelä
6cde03aedc MDEV-34178: Improve PERFORMANCE_SCHEMA instrumentation
When MariaDB is built with PERFORMANCE_SCHEMA support enabled
and with futex-based rw-locks (not srw_lock_), we were unnecessarily
releasing and reacquiring lock.writer in srw_lock_impl::psi_wr_lock()
and ssux_lock::psi_wr_lock().

If there is a conflict with rd_lock(), let us hold the lock.writer
and execute u_wr_upgrade() to wait for rd_unlock().

Reviewed by: Debarun Banerjee
Tested by: Matthias Leich
2024-06-19 13:30:23 +03:00
Alexander Barkov
cfa6143453 MDEV-27966 Assertion `fixed()' failed and Assertion `fixed == 1' failed, both in Item_func_concat::val_str on SELECT after INSERT with collation utf32_bin on utf8_bin table
This problem was earlier fixed by this commit:

> commit 08c7ab404f
> Author: Aleksey Midenkov <midenok@gmail.com>
> Date:   Mon Apr 18 12:44:27 2022 +0300
>
>    MDEV-24176 Server crashes after insert in the table with virtual
>    column generated using date_format() and if()

Adding an mtr test only.
2024-06-19 10:01:30 +04:00
Marko Mäkelä
2bd661ca10 MDEV-34178: Simplify the U lock
The U lock mode of the sux_lock that was introduced in
commit 03ca6495df (MDEV-24142)
is unnecessarily complex.

Internally, sux_lock comprises two parts, each with their own wait queue
inside the operating system kernel: a mutex and a rw-lock.

We can map the operations as follows:

x_lock(): (X,X)
u_lock(): (X,_)
s_lock(): (_,S)

The Update lock mode, which is mutually exclusive with itself and with
X (exclusive) locks but not with shared (S) locks, was unnecessarily
acquiring a shared lock on the second component. The mutual exclusion
is guaranteed by the first component.

We might simplify the #ifdef SUX_LOCK_GENERIC case further by omitting
srw_mutex_impl::lock, because it is kind-of duplicating the mutex
that we will use for having a wait queue. However, the predicate
buf_page_t::can_relocate() would depend on the predicate
is_locked_or_waiting(), which is not available for pthread_mutex_t.

Reviewed by: Debarun Banerjee
Tested by: Matthias Leich
2024-06-18 18:22:28 +03:00
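The mapping in the commit above can be checked with a small model. The sketch assumes that two requests conflict on a component when both hold it and at least one hold is exclusive. The names `MODES`, `component_conflict`, and `compatible` are hypothetical. It only verifies that the (mutex, rw-lock) pairs yield the expected compatibility of X, U and S.

```python
# Each lock mode as (hold on the mutex component, hold on the rw-lock component).
MODES = {
    "X": ("X", "X"),
    "U": ("X", None),
    "S": (None, "S"),
}

def component_conflict(a, b):
    """Two holds on one component conflict if both exist and one is exclusive."""
    return a is not None and b is not None and "X" in (a, b)

def compatible(m1, m2):
    """Two modes are compatible if they conflict on neither component."""
    return not any(
        component_conflict(a, b) for a, b in zip(MODES[m1], MODES[m2])
    )

for first in MODES:
    row = {second: compatible(first, second) for second in MODES}
    print(first, row)
```

In this model U excludes U and X through the first component alone, which matches the commit's claim that the shared hold on the second component was unnecessary.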
Brandon Nesterenko
6cab2f75fe MDEV-23857: replication master password length
After MDEV-4013, the maximum length of replication passwords was extended to
96 ASCII characters. After a restart, however, slaves only read the first 41
characters of MASTER_PASSWORD from the master.info file. This led to slaves
being unable to reconnect to the master after a restart.

After a slave restart, if a master.info file is detected, use the full
allowable length of the password rather than 41 characters.

Reviewed By:
============
Sergei Golubchik <serg@mariadb.com>
2024-06-18 07:21:18 -06:00
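The truncation described above can be reproduced in miniature. The limits 41 and 96 come from the commit message. The function `read_password` and the sample password are hypothetical, and the real master.info parsing is not modelled.

```python
OLD_READ_LIMIT = 41    # characters read back after a restart, before the fix
MAX_PASSWORD_LEN = 96  # maximum password length allowed since MDEV-4013

def read_password(stored_line: str, limit: int) -> str:
    """Return at most `limit` characters of a stored password line."""
    return stored_line.rstrip("\n")[:limit]

password = "p" * 40 + "q" * 56  # a 96-character password
stored = password + "\n"        # as it might be written to a file

truncated = read_password(stored, OLD_READ_LIMIT)    # pre-fix behaviour
restored = read_password(stored, MAX_PASSWORD_LEN)   # post-fix behaviour
```

A password longer than 41 characters no longer matches after the truncated read, which is why the slave could not reconnect.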
Brandon Nesterenko
0e25cc51a9 MDEV-34397: "delete si" rather than "my_free(si)" in THD::register_slave()
In the error case of THD::register_slave(), there is undefined
behavior of Slave_info si because it is allocated via malloc()
(my_malloc), and cleaned up via delete().

This patch makes these consistent by switching si's cleanup
to use my_free.
2024-06-18 07:20:41 -06:00
Andre Alves
0dfc9ece48 MDEV-33618: add mariadbd_safe to option groups 2024-06-18 08:24:03 +01:00
Souradeep Saha
10fbd1ce51 MDEV-34168: Extend perror utility to print link to KB page
As all MariaDB Server errors now have a dedicated web page, the
perror utility is extended to include a link to the KB page of
the corresponding error code.

All new code of the whole pull request, including one or several
files that are either new files or modified ones, are contributed
under the BSD-new license. I am contributing on behalf of my
employer Amazon Web Services, Inc.
2024-06-18 13:25:39 +10:00
Andrei
c37b2a9f04 MDEV-30460 rpl.rpl_start_alter_restart_slave sometimes fails in BB with result length mismatch
The test used to fail because it lacked a synchronization point
confirming that the "CA_1" event expected for processing had
indeed entered this phase.
The event may not even have arrived at the slave.

Fixed by adding the missing synchronization.
2024-06-17 19:06:34 +03:00
Alexander Barkov
83d3ed4908 MDEV-34014 mysql_upgrade failed
Adding a new statement into scripts/sys_schema/before_setup.sql:

  ALTER DATABASE sys CHARACTER SET utf8mb3 COLLATE utf8mb3_general_ci;

to fix db.opt in case:
- the database `sys` was altered to unexpected CHARACTER SET or COLLATE values
- or db.opt was erroneously removed

to make sure that sys objects are always recreated using utf8mb3_general_ci.
2024-06-17 16:38:48 +04:00
Alexander Barkov
c4bf4ce948 Merge remote-tracking branch 'origin/11.2' into 11.4 2024-06-17 15:46:39 +04:00
Sergei Petrunia
2eda310b15 Restore test coverage for MDEV-18956
(It was accidentally removed by fix for MDEV-28846)
2024-06-17 14:08:32 +03:00
Sergei Petrunia
0903276eae MDEV-30651: Assertion `sel->quick' in make_range_rowid_filters, followup
Review followup: RANGE_OPT_PARAM statement_should_be_aborted()
checks for thd->is_fatal_error and thd->is_error(). The first is
redundant when the second is present.
2024-06-17 14:08:32 +03:00
Sergei Petrunia
a2066b2400 MDEV-30651: Assertion `sel->quick' in make_range_rowid_filters
The optimizer deals with Rowid Filters this way:

1. First, range optimizer is invoked. It saves information
   about all potential range accesses.
2. A query plan is chosen. Suppose, it uses a Rowid Filter on
   index $IDX.
3. JOIN::make_range_rowid_filters() calls the range optimizer
again to create a quick select on index $IDX which will be used
to populate the rowid filter.

The problem: KILL command catches the query in step #3. Quick
Select is not created which causes a crash.

Fixed by checking if query was killed. Note: the problem also
affects 10.6, even if error handling for
SQL_SELECT::test_quick_select is different there.
2024-06-17 14:08:32 +03:00
Sergei Petrunia
ef9e3e73ed MDEV-30651: Assertion `sel->quick' in make_range_rowid_filters
(Variant for 10.6: return error code from SQL_SELECT::test_quick_select)
The optimizer deals with Rowid Filters this way:

1. First, range optimizer is invoked. It saves information
   about all potential range accesses.
2. A query plan is chosen. Suppose, it uses a Rowid Filter on
   index $IDX.
3. JOIN::make_range_rowid_filters() calls the range optimizer
again to create a quick select on index $IDX which will be used
to populate the rowid filter.

The problem: KILL command catches the query in step #3. Quick
Select is not created which causes a crash.

Fixed by checking if query was killed.
2024-06-17 12:50:43 +03:00
Marko Mäkelä
a21e49cbcc Merge 11.1 into 11.2 2024-06-17 12:02:03 +03:00
Sergei Petrunia
b47bd3f8bf MDEV-33875: ORDER BY DESC causes ROWID Filter slowdown
Rowid Filter cannot be used with reverse-ordered scans, for the
same reason as IndexConditionPushdown cannot be.

test_if_skip_sort_order() already has logic to disable ICP when
setting up a reverse-ordered scan. Added logic to also disable
Rowid Filter in this case, factored out the code into
prepare_for_reverse_ordered_access(), and added a comment describing
the cause of this limitation.
2024-06-17 09:50:32 +03:00
Marko Mäkelä
d34289a3e2 Merge 10.11 into 11.1 2024-06-17 09:21:50 +03:00
Marko Mäkelä
346a0c1402 Merge 10.6 into 10.11 2024-06-17 09:08:07 +03:00
Marko Mäkelä
e60acae655 Merge 10.5 into 10.6 2024-06-17 08:40:07 +03:00
Monty
956bcf8f49 Change mysqldump to use DO instead of 'SELECT' for storing sequences.
This avoids a lot of SETVAL() results when applying a mysqldump with
sequences.
2024-06-16 10:51:33 +03:00
Monty
fef32fd9ad MDEV-34406 Enhance mariadb_upgrade to print failing query in case of error
To make this possible, it was also necessary to enhance the mariadb
client with the option --print-query-on-error.
This option can also be very useful when running a batch of queries
through the mariadb client and one wants to find out where things go
wrong.

TODO: It would be good to enhance mariadb_upgrade to not call the mariadb
client for executing queries but instead do this internally.  This
would have made this patch much easier!

Reviewed by: Sergei Golubchik <serg@mariadb.com>
2024-06-16 10:51:33 +03:00
Marko Mäkelä
4b4c371fe7 MDEV-34297 fixup: -Wconversion on 32-bit 2024-06-14 13:21:19 +03:00
Thirunarayanan Balathandayuthapani
3271588bb7 MDEV-34381 During innodb_undo_truncate=ON recovery, InnoDB may fail to shrink undo* files
- During recovery, InnoDB may fail to shrink the undo tablespaces
when there are no pages to recover while applying the redo log.
This issue exists only when innodb_undo_truncate is enabled.
trx_lists_init_at_db_start() could've applied the redo logs
for undo tablespace page0.
2024-06-14 12:46:02 +05:30
Marko Mäkelä
32202c30bc Merge 10.5 into 10.6 2024-06-13 19:58:11 +03:00
Marko Mäkelä
c849952b71 MDEV-33840: Fix GCC -Wreorder
This fixes up the merge commit 829cb1a49c
2024-06-13 19:57:40 +03:00
Marko Mäkelä
dd13243b0d MDEV-33161 fixup: CMAKE_CXX_FLAGS=-DEXTRA_DEBUG 2024-06-13 19:42:18 +03:00
Oleksandr Byelkin
c9414ccd67 Move debug dependent MDEV-32441 test in separate file 2024-06-13 10:51:24 +02:00
Monty
69c07f70a1 Fix MDEV-32441 stability 2024-06-13 10:32:33 +02:00
Marko Mäkelä
5b89cab44f Merge 10.6 into 10.11 2024-06-13 08:16:49 +03:00
Thirunarayanan Balathandayuthapani
b40f9d3d5c MDEV-34374 Shrinking tablespace logic fails to handle error condition
- InnoDB ignores the error while traversing the used
extents during shrinking process. Made changes in
fsp_traverse_extents() to handle error condition
correctly
2024-06-12 17:36:12 +05:30
Brandon Nesterenko
d3a7e46bb4 MDEV-34365: UBSAN runtime error: call to function io_callback(tpool::aiocb*)
On an UBSAN clang-15 build, if running with UBSAN option
halt_on_error=1 (the issue doesn't show up without it),
MTR fails during mysqld --bootstrap with UBSAN error:

call to function io_callback(tpool::aiocb*) through pointer to incorrect function type 'void (*)(void *)'

This patch corrects the parameter type of io_callback
to match its expected type defined by callback_func,
i.e. (void*).

Reviewed By:
============
<TODO>
2024-06-12 08:39:41 +03:00
Marko Mäkelä
fc9005adc4 Merge 10.5 into 10.6 2024-06-12 07:51:28 +03:00
Thirunarayanan Balathandayuthapani
5b39ded713 MDEV-34156 InnoDB fails to apply the redo log for compressed tablespace
Problem:
=======
During recovery, InnoDB fails to apply the redo log for
compressed tablespace. The reason is that InnoDB assumes
that pages have been freed while applying the redo log for them.
During multiple scans of the redo log, InnoDB stores the freed
page information when it has sufficient buffer pool pages.
Once it runs out of memory, InnoDB doesn't store freed page
information. But InnoDB assigns the freed page ranges to the tablespace
in recv_init_crash_recovery_spaces() even though
InnoDB doesn't have complete freed range information.
While applying the redo log, InnoDB wrongly assumes that a page
has been freed, and this could lead to corruption of the tablespace.
This issue is caused by commit 941af1fa58 (MDEV-31803)
and commit 2f9e264781 (MDEV-29911).

Solution:
========
During recovery, set recovery size and freed page information
for all tablespaces, irrespective of available memory.
2024-06-11 16:14:36 +05:30