Commit graph

195472 commits

Author SHA1 Message Date
Alexey Yurchenko
9d7e596ba6 MDEV-26971: JSON file interface to wsrep node state.
Integration with status reporter in wsrep-lib.

Status reporter reports changes in wsrep state and logged errors/
warnings to a json file which then can be read and interpreted by
an external monitoring tool.

Rationale: until the server is fully initialized it is inaccessible
to clients and the only source of information is the error log, which
is not machine-friendly. Since a wsrep node can spend a very long time
in the initialization phase (state transfer), automated tools may be
unable to monitor its liveness and progress for a very long time.

New variable: wsrep_status_file specifies the output file name.
If not set, no file is created and no reporting is done.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-18 16:38:41 +01:00
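An external monitoring tool could poll the file named by wsrep_status_file roughly as sketched below. The commit message does not describe the JSON layout, so this sketch treats the document as an opaque dictionary. The helper name `read_status` and the decision to treat a missing or half-written file as "no data yet" are assumptions for illustration, not part of the commit.

```python
import json
from pathlib import Path
from typing import Optional


def read_status(path: str) -> Optional[dict]:
    """Return the parsed status document, or None if it is not usable yet.

    The file may be absent (wsrep_status_file not set, or nothing written
    yet) or caught mid-write, so both cases are reported as "no data"
    instead of raising.
    """
    try:
        text = Path(path).read_text(encoding="utf-8")
    except FileNotFoundError:
        return None
    try:
        doc = json.loads(text)
    except json.JSONDecodeError:
        return None
    # Assumption: only a JSON object counts as a status document here.
    return doc if isinstance(doc, dict) else None
```

A poller would call `read_status()` periodically and compare successive results to detect state changes while the node is still in state transfer.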
Jan Lindström
d526551587 Update wsrep-lib submodule 2022-03-18 16:38:41 +01:00
Marko Mäkelä
f6fcf827b3 MDEV-28111 fixup: Correct the start-up message
By default, we should say "InnoDB: Buffered log writes" instead of
"InnoDB: File system buffers for log disabled" if we are able to
determine the physical block size on Linux.
2022-03-18 15:27:57 +02:00
Monty
74e668eaeb Fixed warning for maria.maria-recovery2 about crashed table
The bug was a missing va_start in eprint() which caused a wrong table
name to be printed.
Patch backported from 10.3.
2022-03-18 13:26:50 +02:00
Sergei Golubchik
10d9b890b0 Merge branch '10.8' into 10.9 2022-03-18 11:14:48 +01:00
Marko Mäkelä
8840583a92 MDEV-27909 InnoDB: Failing assertion: state == TRX_STATE_NOT_STARTED ... on DDL
The fix in commit 6e390a62ba (MDEV-26772)
was a step in the right direction, but implemented incorrectly.
When an InnoDB persistent statistics table cannot be locked immediately,
we must not let row_mysql_handle_errors() roll back the transaction.

lock_table_for_trx(): Add the parameter no_wait (default false)
for an immediate return of DB_LOCK_WAIT in case of a conflict.

ha_innobase::delete_table(), ha_innobase::rename_table():
Pass no_wait=true to lock_table_for_trx() when needed,
instead of temporarily setting THDVAR(thd, lock_wait_timeout) to 0.
2022-03-18 10:52:08 +02:00
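The no_wait idea in the commit above can be modelled in a few lines of Python. This is only a sketch of the calling convention: the real lock_table_for_trx() is C++ and takes InnoDB table and transaction arguments, while `lock_table`, `DbErr` and the use of `threading.Lock` are stand-ins invented for illustration.

```python
import threading
from enum import Enum


class DbErr(Enum):
    SUCCESS = 0
    LOCK_WAIT = 1  # stands in for DB_LOCK_WAIT


def lock_table(lock: threading.Lock, no_wait: bool = False) -> DbErr:
    """Acquire `lock`; with no_wait=True, report a conflict immediately."""
    if no_wait:
        # Fail fast: the caller decides what to do, nothing is rolled back here.
        acquired = lock.acquire(blocking=False)
        return DbErr.SUCCESS if acquired else DbErr.LOCK_WAIT
    lock.acquire()  # default behaviour: wait for the lock
    return DbErr.SUCCESS
```

Passing an explicit flag keeps the "do not wait" decision local to the call, which is the alternative the commit chose over temporarily overriding a session-wide timeout.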
Sergei Golubchik
5c11e7eead update test results 2022-03-18 09:47:58 +01:00
Jan Lindström
c519aa3d7a MDEV-24143 : Galera nodes "randomly" crashing in Item_func_release_lock::val_int
Fixed on MDEV-27713. Added additional test case.
2022-03-18 08:30:26 +02:00
mkaruza
507030c492 MDEV-27713 Crash after a conflict of applier thread with stored procedure call by event scheduler
When a thread is BF aborted by a high priority service, its ULLs (user level
locks) need to be removed and released. Directly releasing a lock of
MDL_EXPLICIT type does not also clear `thd->ull_hash`. The method
`mysql_ull_cleanup` properly clears all information about ULL locks
for the thread.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-18 08:30:26 +02:00
mkaruza
304f75c973 MDEV-27568 Parallel async replication hangs on a Galera node
Using parallel slave applying can cause a deadlock between DDL and
other events. A GTID with a lower seqno can be blocked in Galera when the node
has entered TOI mode, but a DDL GTID with a higher seqno can be blocked
before previous GTIDs are applied locally.

Fix is to check prior commits before entering TOI.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-18 08:30:26 +02:00
Daniele Sciascia
c63eab2c68 MDEV-28055: Galera ps-protocol fixes
* Fix test galera.MW-44 to make it work with --ps-protocol
* Skip test galera.MW-328C under --ps-protocol. This test
  relies on wsrep_retry_autocommit, which has no effect
  under ps-protocol.
* Return WSREP related errors on COM_STMT_PREPARE commands
  Change wsrep_command_no_result() to allow sending back errors
  when a statement is prepared. For example, to handle deadlock
  error due to BF aborted transaction during prepare.
* Add sync waiting before statement prepare
  When a statement is prepared, tables used in the statement may be
  opened and checked for existence. Because of that, some tests (for
  example galera_create_table_as_select) that CREATE a table in one node
  and then SELECT from the same table in another node may result in errors
  due to non existing table.
  To make tests behave similarly under normal and PS protocol, we add a
  call to sync wait before preparing statements that would sync wait
  during normal execution.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-18 08:30:26 +02:00
Daniele Sciascia
39ed400553 Fixup for MDEV-27553
Update wsrep-lib which contains a fixup introduced with MDEV-27553.
Also, adapt the corresponding test: after apply failure on ROLLBACK,
node will disconnect from cluster

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-18 08:30:25 +02:00
sjaakola
97582f1c06 MDEV-27649 PS conflict handling causing node crash
Handling BF abort for prepared statement execution so that EXECUTE processing will continue
until parameter setup is complete, before BF abort bails out the statement execution.

THD class has new boolean member: wsrep_delayed_BF_abort, which is set if BF abort is observed
in do_command() right after reading client's packet, and if the client has sent PS execute command.
In such case, the deadlock error is not returned immediately back to client, but the PS execution
will be started. However, the PS execution loop will now check if wsrep_delayed_BF_abort is set, and
stop the PS execution after the type information has been assigned for the PS.
With this, the PS protocol type information, which is present in the first PS EXECUTE command, is not lost
even if the first PS EXECUTE command was marked to abort.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-18 08:30:25 +02:00
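The control flow described in the commit above can be modelled as follows. This is a toy Python model, not the server code: `Session`, the command strings and the `"deadlock"`/`"ok"` return values are illustrative assumptions, and only the name wsrep_delayed_BF_abort comes from the commit message.

```python
from typing import List, Optional


class Session:
    def __init__(self) -> None:
        self.wsrep_delayed_bf_abort = False
        self.ps_types: Optional[List[str]] = None  # PS parameter type info


def do_command(session: Session, command: str, bf_aborted: bool,
               param_types: Optional[List[str]] = None) -> str:
    if bf_aborted:
        if command != "STMT_EXECUTE":
            return "deadlock"  # other commands fail immediately
        # PS execute: postpone the error until parameter setup is done.
        session.wsrep_delayed_bf_abort = True

    if command == "STMT_EXECUTE":
        if param_types is not None:
            # Type information is only sent with the first EXECUTE,
            # so it must be recorded even if this execution is aborted.
            session.ps_types = param_types
        if session.wsrep_delayed_bf_abort:
            session.wsrep_delayed_bf_abort = False
            return "deadlock"
    return "ok"
```

In this model a BF-aborted first EXECUTE still returns the deadlock error, but a later EXECUTE finds the type information intact.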
sjaakola
8e9e1c3979 MDEV-27649 Crash with PS execute after BF abort
This commit contains a test for reproducing the issue in MDEV-27649,
where a transaction, executing a prepared statement, is BF aborted.
The scenario in MDEV-27649 has a transaction which has prepared a PS,
but not yet executed it, and this transaction is then BF aborted in this state.
When the BF aborted transaction tries to execute the PS, it will receive a deadlock error.
But when it tries to execute the PS a second time, the node crashes.

Mtr test galera.galera_bf_abort_ps_bind exercises this scenario.

However, the mtr test platform does not have a mechanism to control the execution of PS in the required detail.
For this purpose, mysqltest.cc was extended to contain 4 new commands:
PS_prepare   - to prepare a prepared statement
PS_bind      - to bind values for parameters for the PS
PS_execute   - to execute the PS
PS_close     - to close the PS

The support for controlling prepared statements in mtr scripts is quite minimal
in this commit. Limitations are:
* only one PS can be used by a connection, at a time
* only input parameters can be bound for the PS
* only varchar, integer or float type of parameters can be bound

added the result

fixes

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2022-03-18 08:30:18 +02:00
Otto Kekäläinen
9595ea8992 Deb: Sync Salsa-CI from Debian MariaDB 10.5 repository
Since Debian Sid now has MariaDB 10.6, we can't do any upgrade tests in
Debian Sid for the 10.5 branch anymore. It would just fail with downgrade
errors.

Also, since MariaDB 10.5 is no longer in Sid, we can't even test 10.5.x
to 10.5.y upgrades in Sid.

Instead the 10.5 branch salsa-ci.yml should run all builds and tests based
on Debian Bullseye, which has MariaDB 10.5 (only).

To achieve this, essentially sync most of the salsa-ci.yml contents from
https://salsa.debian.org/mariadb-team/mariadb-10.5/-/tree/bullseye

Also add a couple Lintian overrides to make Salsa-CI pass.

NOTE TO MERGERS: This commit is intended for the 10.5 branch only, do not
merge anything from it on 10.6 or any other branch.
2022-03-17 20:41:50 -07:00
Daniel Black
065f995e6d Merge branch 10.5 into 10.6 2022-03-18 12:17:11 +11:00
Sergei Golubchik
ecb6f9c894 MDEV-28095 crash in multi-update and implicit grouping
disallow implicit grouping in multi-update.
explicit GROUP BY is not allowed by the grammar.
2022-03-17 16:58:48 +01:00
Sergei Golubchik
df2a8d728c Merge branch '10.8' into 10.9 2022-03-17 14:38:35 +01:00
Sergei Golubchik
9aea73f74f Merge branch '10.7' into 10.8 2022-03-17 12:18:40 +01:00
Sergei Golubchik
bf8dc0be9e fix columnstore compilation after 33c30da165
normally, one has to include my_global.h before
including psi/psi*.h files. ColumnStore cannot do it,
so it needs a workaround.
2022-03-17 12:13:19 +01:00
Marko Mäkelä
c4c8830709 MDEV-28111 Redo log writes are being buffered on Linux for no good reason
In commit 685d958e38 (MDEV-14425)
we ended up not enabling O_DIRECT writes on the redo log
by default, because back then, it was slightly slower on
some systems.

With commit a635c40648 (MDEV-27774)
the situation changed. A new test on a NVMe device shows 9%
improvement in throughput and over 15% reduction of latency
when O_DIRECT writes are enabled.

With this change, all the following settings will use O_DIRECT
on InnoDB data and log files:

innodb_flush_method=O_DIRECT
innodb_flush_method=O_DIRECT_NO_FSYNC
innodb_flush_method=O_DSYNC

Before MDEV-14425, log writes were always buffered on Linux.
Between MDEV-14425 and this change, unbuffered log writes
were only enabled for innodb_flush_method=O_DSYNC.
2022-03-17 12:00:00 +02:00
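The three stages described in the MDEV-28111 commit message can be summarised as a small lookup. This Python sketch only restates what the message says about Linux. The function name and era labels are made up for illustration, and the function returns None where the message gives no information (for example, other innodb_flush_method values after this change).

```python
from typing import Optional

BEFORE_14425 = "before MDEV-14425"
BETWEEN = "MDEV-14425 up to MDEV-28111"
WITH_28111 = "with MDEV-28111"

# Settings that the commit message lists as using O_DIRECT on data and log files.
LISTED_METHODS = frozenset({"O_DIRECT", "O_DIRECT_NO_FSYNC", "O_DSYNC"})


def log_writes_unbuffered(flush_method: str, era: str) -> Optional[bool]:
    """True/False if the commit message states it, None if it does not say."""
    if era == BEFORE_14425:
        return False  # log writes were always buffered
    if era == BETWEEN:
        return flush_method == "O_DSYNC"  # the only unbuffered case
    if era == WITH_28111:
        return True if flush_method in LISTED_METHODS else None
    raise ValueError(f"unknown era: {era!r}")
```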
Alexander Barkov
22fd31c588 MDEV-28078 Garbage on multiple equal ENUMs with tricky character sets
TYPELIBs for ENUM/SET columns could erroneously undergo redundant
hex-unescaping at the table open time.

Fix:
- Prevent multiple unescaping of the same TYPELIB
- Prevent sharing TYPELIBs between columns with different mbminlen
2022-03-17 13:05:03 +04:00
Marko Mäkelä
06e3bc4390 MDEV-17841 fixup: GCC -Wmaybe-uninitialized
In commit ab38b7511b
an added "goto err" would seemingly cause a read of
an uninitialized variable old_info if errpos>=5.

However, because we would have errpos=0 at that point,
there was no real error.
2022-03-17 10:33:06 +02:00
Marko Mäkelä
118826d173 Fix gcc-12 -O2 -Warray-bounds 2022-03-17 10:20:07 +02:00
Marko Mäkelä
75e39f3cba Fix gcc-12 -O2 -Wmaybe-uninitialized 2022-03-17 10:13:50 +02:00
Marko Mäkelä
9350945023 Merge 10.8 into 10.9 2022-03-17 09:59:37 +02:00
Marko Mäkelä
86820837cb MDEV-28043 fixup: GCC -m32 -Wconversion 2022-03-17 09:40:46 +02:00
Daniel Black
b73d852779 Merge 10.4 to 10.5 2022-03-17 17:03:24 +11:00
Marko Mäkelä
ee80c19633 MDEV-26551 InnoDB crash on multiple concurrent SHOW TABLE STATUS
dict_get_and_save_data_dir_path(): Protect the operation with
dict_table_t::lock_mutex and avoid unnecessary memory allocation.
2022-03-16 17:19:13 +02:00
Thirunarayanan Balathandayuthapani
31ad9277fe MDEV-28079 Shutdown hangs after altering innodb partition fts table
- InnoDB purge waits at resume_FTS() while shutting down.
This happens after altering a partitioned InnoDB table with an FTS index.
stop_FTS() is called for each partition, but resume_FTS() is called
only once, which leads to a hang during shutdown.
This issue was introduced by
commit 1bd681c8b3c5213ce1f7976940a7dc38b48a0d39 (MDEV-25506).
2022-03-16 19:20:27 +05:30
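Why an unbalanced stop/resume pair leads to a hang can be illustrated with a toy counter. This is a simplified Python model under the assumption that each stop must be matched by one resume. The class and method names are hypothetical and the real InnoDB mechanism may differ in detail.

```python
class PauseCounter:
    """Counts outstanding stop requests; work may proceed only at zero."""

    def __init__(self) -> None:
        self.paused = 0

    def stop(self) -> None:
        self.paused += 1

    def resume(self) -> None:
        if self.paused == 0:
            raise RuntimeError("resume() without matching stop()")
        self.paused -= 1

    def can_proceed(self) -> bool:
        return self.paused == 0


def alter_partitions(counter: PauseCounter, partitions: int, balanced: bool) -> None:
    for _ in range(partitions):
        counter.stop()           # one stop per partition
    resumes = partitions if balanced else 1
    for _ in range(resumes):
        counter.resume()         # the bug: only one resume in total
```

With three partitions and a single resume, the counter stays at two, so anything waiting for it to reach zero waits forever.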
Marko Mäkelä
0f56e21efa MDEV-28091 PERFORMANCE_SCHEMA unit tests fail due to memory misalignment
Let us make the mocked-up pfs_malloc() return aligned memory, just
like the actual implementation does.
2022-03-16 11:49:47 +02:00
Marko Mäkelä
5be92887c2 Merge 10.8 into 10.9 2022-03-16 09:14:11 +02:00
Marko Mäkelä
f7f7a3238e MDEV-28051 Error compiling libmariadb/plugins/compress/c_zstd.c 2022-03-16 09:10:16 +02:00
Marko Mäkelä
00b88376e1 MDEV-27812 fixup: Use log_sys.get_block_size()
The data member log_sys.block_size is only defined in environments
where we can determine the physical block size of a file
(currently, Linux and Microsoft Windows). The accessor function
get_block_size() is available everywhere.

Thanks to Dmitry Shulga for reporting the build failure.
2022-03-16 08:41:56 +02:00
Daniel Black
069139a549 Merge 10.3 to 10.4
extra2_read_len resolved by keeping the implementation
in sql/table.cc and exposing it for use by ha_partition.cc

Remove identical implementation in unireg.h
(ref: bfed2c7d57)
2022-03-16 16:39:10 +11:00
Daniel Black
6a2d88c132 Merge 10.2 to 10.3 2022-03-16 12:51:22 +11:00
Alexander Barkov
0e63023cb8 Merge branch 10.2 into 10.3 2022-03-16 12:49:13 +11:00
Daniel Black
b2c81e06b0 MDEV-27955 main.func_json_notembedded test fails on out-of-memory
Uses 500M+ of memory by repeating an 8 byte sequence 62.5M times.

Reduce the number of repeats of the string by a factor of 100.

Tested by applying against the reverted MDEV-24909 code. 1000 times
reduction was too much, but 100 still managed to trigger the bug.
2022-03-16 09:41:54 +11:00
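The sizes quoted in the commit above follow from simple arithmetic, shown here as a short Python check. Only the string payload is counted. Any extra memory the server uses on top of it (the "+" in "500M+") is not modelled.

```python
SEQUENCE_BYTES = 8
ORIGINAL_REPEATS = 62_500_000          # "62.5M times"

original_bytes = SEQUENCE_BYTES * ORIGINAL_REPEATS

# The commit reduces the repeat count by a factor of 100.
reduced_repeats = ORIGINAL_REPEATS // 100
reduced_bytes = SEQUENCE_BYTES * reduced_repeats

print(original_bytes, reduced_repeats, reduced_bytes)
```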
Daniel Black
57dbe8785d MDEV-23915 ER_KILL_DENIED_ERROR not passed a thread id (part 2)
Per Marko's comment in JIRA, sql_kill is passing the thread id
as long long. We change the format of the error messages to match,
and cast the thread id to long long in sql_kill_user.
2022-03-16 09:37:45 +11:00
Daniel Black
99837c61a6 MDEV-23915 ER_KILL_DENIED_ERROR not passed a thread id
The 10.5 test error main.grant_kill showed up an incorrect
thread id on a big endian architecture.

The cause of this is that the sql_kill_user function assumed the
error was ER_OUT_OF_RESOURCES, when the actual error was
ER_KILL_DENIED_ERROR. The ER_KILL_DENIED_ERROR error message
requires a thread id to be passed as unsigned long, however a
user/host was passed.

ER_OUT_OF_RESOURCES doesn't even take a user/host, despite
the optimistic comment. We remove this being passed as an
argument to the function so that when MDEV-21978 is implemented
one less compiler format warning is generated (which would
have caught this error sooner).

Thanks Otto for reporting and Marko for analysis.
2022-03-16 09:37:45 +11:00
Marko Mäkelä
1ecf173741 Merge 10.8 into 10.9 2022-03-15 18:26:29 +02:00
Marko Mäkelä
9f5a3e5689 Merge 10.7 into 10.8 2022-03-15 18:18:07 +02:00
Marko Mäkelä
dc4b7f382b Merge 10.6 into 10.7 2022-03-15 15:25:31 +02:00
Marko Mäkelä
4ef44cc2f9 Merge 10.5 into 10.6 2022-03-15 14:49:24 +02:00
Marko Mäkelä
73fee39ea6 MDEV-27985 buf_flush_freed_pages() causes InnoDB to hang
buf_flush_freed_pages(): Assert that neither buf_pool.mutex
nor buf_pool.flush_list_mutex is held. Simplify the loops.
Return the tablespace and the number of pages written or punched.

buf_flush_LRU_list_batch(), buf_do_flush_list_batch():
Release buf_pool.mutex before invoking buf_flush_space().

buf_flush_list_space(): Acquire the mutexes only after invoking
buf_flush_freed_pages().

Reviewed by: Thirunarayanan Balathandayuthapani
2022-03-15 14:44:22 +02:00
Marko Mäkelä
de4ec44b4f Fix clang -Wtypedef-redefinition
Include my_global.h before mysql.h so that the latter will not
redefine my_socket and my_ulonglong.

Fixup for commit 77c184df7c
2022-03-15 12:57:39 +02:00
Marko Mäkelä
8575d2fb39 MDEV-28043 Race condition between mtr_t::commit() and checkpoint
In commit a635c40648 (MDEV-27774)
a race condition was introduced between mtr_t::commit() and
a log checkpoint.

Between the time of assigning the log sequence number and adding
the changed pages to buf_pool.flush_list, the log_sys.latch must
be continuously held by the current thread, or otherwise a
log checkpoint could get the wrong result from
buf_pool.get_oldest_modification().

buf_pool_t::insert_into_flush_list(): Add a debug assertion for
increasing the probability of catching this type of problem.

mtr_t::m_latch_ex: A flag that indicates whether the mini-transaction
is holding log_sys.latch in exclusive mode.

mtr_t::do_write(), mtr_t::finish_write(): Remove the parameter
"bool ex" and refer to m_latch_ex instead.

mtr_t::commit(): Release log_sys.latch according to m_latch_ex.

mtr_t::commit_shrink(), mtr_t::commit_files(): Set m_latch_ex.

mtr_t::do_write(): Do not release an exclusive log_sys.latch,
but instead set m_latch_ex if needed.
2022-03-15 12:35:40 +02:00
Marko Mäkelä
00896db1c5 MDEV-25214 Crash in fil_space_t::try_to_close
fil_space_t::try_to_close(): Tolerate a tablespace that has no
data files attached. The function fil_ibd_create() initially
creates and attaches a tablespace with no files, and invokes
fil_space_t::add() later.

fil_node_open_file(): After releasing and reacquiring fil_system.mutex,
check if the file was already opened by another thread. This avoids
an assertion failure !node->is_open() in fil_node_open_file_low().

These failures were reproduced with the test
innodb.table_definition_cache_debug and the fix of MDEV-27985.
2022-03-15 10:37:13 +02:00
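The fil_node_open_file() change above is an instance of re-checking state after a mutex has been released and reacquired. A minimal Python model of that pattern is sketched below. `Node`, `open_file` and the `slow_step` callback are hypothetical names, and the real function works on InnoDB file nodes under fil_system.mutex.

```python
import threading
from typing import Callable


class Node:
    def __init__(self) -> None:
        self.is_open = False
        self.open_calls = 0  # how often the low-level open ran


def open_file(node: Node, mutex: threading.Lock,
              slow_step: Callable[[], None] = lambda: None) -> None:
    """Caller holds `mutex` on entry; it is held again on return."""
    if node.is_open:
        return
    mutex.release()
    try:
        slow_step()  # work done without the mutex; others may open the node now
    finally:
        mutex.acquire()
    if node.is_open:
        return       # another thread opened it meanwhile: nothing left to do
    node.is_open = True
    node.open_calls += 1
```

Without the second `is_open` check, the low-level open would run on an already open node, which corresponds to the `!node->is_open()` assertion failure mentioned in the commit message.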
Marko Mäkelä
e1246775a9 Merge 10.4 into 10.5 2022-03-15 08:32:28 +02:00
Marko Mäkelä
9c6135e81f Merge 10.3 into 10.4 2022-03-15 08:10:35 +02:00