Commit graph

7698 commits

Author SHA1 Message Date
Marko Mäkelä
232af0c7bf Do not disable --symbolic-links on Valgrind (or MSAN)
The option --symbolic-links was originally disabled by default under
Purify (and later Valgrind) in 51156c5af2
without any explanation.
2022-04-25 09:36:30 +03:00
Brandon Nesterenko
a83c7ab1ea MDEV-11853: semisync thread can be killed after sync binlog but before ACK in the sync state
Problem:
========
If a primary is shutdown during an active semi-sync connection
during the period when the primary is awaiting an ACK, the primary
hard kills the active communication thread and does not ensure the
transaction was received by a replica. This can lead to an
inconsistent replication state.

Solution:
========
During shutdown, the primary should wait for an ACK or timeout
before hard killing a thread which is awaiting a communication. We
extend the `SHUTDOWN WAIT FOR SLAVES` logic to identify and ignore
any threads waiting for a semi-sync ACK in phase 1. Then, before
stopping the ack receiver thread, the shutdown is delayed until all
waiting semi-sync connections receive an ACK or time out. The
connections are then killed in phase 2.

Notes:
 1) There remains an unresolved corner case that affects this
patch. MDEV-28141: Slave crashes with Packets out of order when
connecting to a shutting down master. Specifically, If a slave is
connecting to a master which is actively shutting down, the slave
can crash with a "Packets out of order" assertion error. To get
around this issue in the MTR tests, the primary will wait a small
amount of time before phase 1 killing threads to let the replicas
safely stop (if applicable).
 2) This patch also fixes MDEV-28114: Semi-sync Master ACK Receiver
Thread Can Error on COM_QUIT

Reviewed By
============
Andrei Elkin <andrei.elkin@mariadb.com>
2022-04-22 12:59:54 -06:00
Marko Mäkelä
fae0ccad6e Merge 10.5 into 10.6 2022-04-21 17:46:40 +03:00
Marko Mäkelä
620c55e708 Merge 10.4 into 10.5 2022-04-21 15:33:50 +03:00
Marko Mäkelä
394784095e Merge 10.3 into 10.4 2022-04-21 11:33:59 +03:00
Sergei Golubchik
bbdec04d59 MDEV-24317 Data race in LOGGER::init_error_log at sql/log.cc:1443 and in LOGGER::error_log_print at sql/log.cc:1181
don't initialize error_log_handler_list in set_handlers()
* error_log_handler_list is initialized to LOG_FILE early, in init_base()
* set_handlers always reinitializes it to LOG_FILE, so it's pointless
* after init_base() concurrent threads start using sql_log_warning,
  so following set_handlers() shouldn't modify error_log_handler_list
  without some protection
2022-04-12 13:07:20 +02:00
Vladislav Vaintroub
5a4a37076d MDEV-10183 implement service_manager_extend_timeout on Windows
The implementation calls SetServiceStatus() with updated
SERVICE_STATUS::dwHint and SERVICE_STATUS::dwCheckpoint
2022-04-11 07:49:43 +02:00
Daniel Black
065f995e6d Merge branch 10.5 into 10.6 2022-03-18 12:17:11 +11:00
Daniel Black
b73d852779 Merge 10.4 to 10.5 2022-03-17 17:03:24 +11:00
Daniel Black
069139a549 Merge 10.3 to 10.4
extra2_read_len resolved by keeping the implementation
in sql/table.cc by exposed it for use by ha_partition.cc

Remove identical implementation in unireg.h
(ref: bfed2c7d57)
2022-03-16 16:39:10 +11:00
Daniel Black
a950086036 Merge 10.2 (part) into 10.3
commit '6de482a6fefac0c21daf33ed465644151cdf879f'

10.3 no longer errors in truncate_notembedded.test
but per comments, a non-crash is all that we are after.
2022-03-15 16:44:52 +11:00
Haidong Ji
114476f2ec MDEV-27978 fix wrong name in error when max_session_mem_used exceeded
Fixed typo in my_malloc_size_cb_func. There is no max-thread-mem-used
sys variable in MariaDB, only max-session-mem-used. The relevant entry
in sys_vars.cc is also fixed.

Added a fallback case in case we could allocate the 256 bytes for the
error message containing the exact setting.
2022-03-08 15:13:09 +11:00
Haidong Ji
120480d15b MDEV-27394 added memset to zero out addr.un.sun_path string
If this char[] is not zeroed out, inaccurate log message described in
MDEV-27394 will show up.
2022-02-22 09:41:16 +11:00
Oleksandr Byelkin
f5c5f8e41e Merge branch '10.5' into 10.6 2022-02-03 17:01:31 +01:00
Oleksandr Byelkin
cf63eecef4 Merge branch '10.4' into 10.5 2022-02-01 20:33:04 +01:00
Jan Lindström
f43ef9ba3a MDEV-25977 : Warning: Memory not freed: 32 on SET GLOBAL wsrep_sst_auth=USER
Add missing wsrep_sst_auth_free call.
2022-01-18 07:10:48 +02:00
Marko Mäkelä
3f5726768f Merge 10.5 into 10.6 2022-01-04 09:26:38 +02:00
Marko Mäkelä
c9db50b585 Merge 10.4 into 10.5 2022-01-03 07:23:49 +02:00
Monty
a48d2ec866 Add --valgrind to VERSION() string for valgrind builds
Fixes main.sp-no-valgrind for valgrind builds not done with BUILD scripts
2021-12-28 16:37:14 +02:00
Sergei Golubchik
89a0364fc8 MDEV-27304 SHOW ... result columns are right-aligned
--version=value was setting sys_var::CONFIG (meaning, the value
came from the config file), but the filename was left as NULL.
2021-12-27 13:28:25 +01:00
Julius Goryavsky
55bb933a88 Merge branch 10.4 into 10.5 2021-12-26 12:51:04 +01:00
Julius Goryavsky
681b7784b6 Merge branch 10.3 into 10.4 2021-12-25 12:13:03 +01:00
Julius Goryavsky
3376668ca8 Merge branch 10.2 into 10.3 2021-12-23 14:14:04 +01:00
Sergei Golubchik
0745db7179 don't use buffered_option_error_reporter without perfschema
it's not printed, not cleaned up without perfschema,
so isn't supposed to be written into either

this fixes "Memory not freed" warnings when early command line
options produce warnings in non-perfschema builds
2021-12-10 09:40:50 +01:00
Marko Mäkelä
dc8def73f7 Merge 10.5 into 10.6 2021-11-16 16:30:45 +02:00
Vladislav Vaintroub
72afeaf8a5 MDEV-18439 Windows builds should have core_file=1 by default
reapply patch 96b472c0ae
It was not merged correctly into 10.5+
2021-11-13 10:31:46 +01:00
Marko Mäkelä
25ac047baf Merge 10.5 into 10.6 2021-11-09 09:11:50 +02:00
Marko Mäkelä
9c18b96603 Merge 10.4 into 10.5 2021-11-09 08:50:33 +02:00
Marko Mäkelä
47ab793d71 Merge 10.3 into 10.4 2021-11-09 08:40:14 +02:00
Marko Mäkelä
524b4a89da Merge 10.2 into 10.3 2021-11-09 08:26:59 +02:00
Marko Mäkelä
576afceadd MDEV-26966: Remove innodb_force_load_corrupted
MySQL 5.5 in commit 177d8b0c12
introduced a configuration parameter innodb_force_load_corrupted
whose purpose was to allow a corrupted table to be dropped.

Given that MDEV-11412 in MariaDB 10.5.4 aims to allow any metadata
for a missing or corrupted table to be dropped, and given that
MDEV-17567 and MDEV-25506 and related tasks made DDL operations
crash-safe, the parameter no longer serves any purpose.

Because this obscure parameter was read-only (not settable by a client),
it seems that we can simply declare it with MARIADB_REMOVED_OPTION
(commit 1bc9cce702) without breaking
any upgrades.

DICT_ERR_IGNORE_INDEX: Replaces DICT_ERR_IGNORE_INDEX_ROOT and
DICT_ERR_IGNORE_CORRUPT, which were always set equally.

dict_load_indexes(): Report "No indexes found for table" in
a uniform way, and only when the DICT_ERR_IGNORE_INDEX flag is
not set.

If the clustered index is marked corrupted, and the operation
is DICT_ERR_IGNORE_DROP (we are about to drop the table), we will
load the metadata; else, we will return DB_INDEX_CORRUPT.

If SYS_INDEXES.PAGE is FIL_NULL, report an error or warning
unless we are about to drop the table.

dict_load_table_one(): Simplify the logic.
2021-11-04 09:55:35 +02:00
Marko Mäkelä
026984c360 MDEV-26949 --debug-gdb installs redundant signal handlers
There is a server startup option --gdb a.k.a. --debug-gdb that requests
signals to be set for more convenient debugging. Most notably, SIGINT
(ctrl-c) will not be ignored, and you will be able to interrupt the
execution of the server while GDB is attached to it.

When we are debugging, the signal handlers that would normally display
a terse stack trace are useless.

When we are debugging with rr, the signal handlers may interfere with
a SIGKILL that could be sent to the process by the environment, and ruin
the rr replay trace, due to a Linux kernel bug
https://lkml.org/lkml/2021/10/31/311

To be able to diagnose bugs in kill+restart tests, we may really need
both a trace before the SIGKILL and a trace of the failure after a
subsequent server startup. So, we had better avoid hitting the problem
by simply not installing those signal handlers.
2021-11-01 10:29:58 +02:00
Marko Mäkelä
a8379e53e8 Merge 10.5 into 10.6
The changes to galera.galear_var_replicate_myisam_on
in commit d9b933bec6
are omitted due to conflicts
with commit 27d66d644c.
2021-10-13 13:28:12 +03:00
Marko Mäkelä
99bb3fb656 Merge 10.4 into 10.5 2021-10-13 12:33:56 +03:00
Marko Mäkelä
a736a3174a Merge 10.3 into 10.4 2021-10-13 12:03:32 +03:00
Marko Mäkelä
4a7dfda373 Merge 10.2 into 10.3 2021-10-13 11:38:21 +03:00
Alexander Barkov
eadd878808 MDEV-23269 SIGSEGV in ft_boolean_check_syntax_string on setting ft_boolean_syntax
The crash happened because my_isalnum() does not support character
sets with mbminlen>1.

The value of "ft_boolean_syntax" is converted to utf8 in do_string_check().
So calling my_isalnum() is combination with "default_charset_info" was wrong.

Adding new parameters (size_t length, CHARSET_INFO *cs) to
ft_boolean_check_syntax_string() and passing self->charset(thd)
as the character set.
2021-10-11 17:43:23 +04:00
Marko Mäkelä
f59f5c4a10 Revert MDEV-25114
Revert 88a4be75a5 and
9d97f92feb, which had been
prematurely pushed by accident.
2021-09-24 16:21:20 +03:00
sjaakola
88a4be75a5 MDEV-25114 Crash: WSREP: invalid state ROLLED_BACK (FATAL)
This patch is the plan D variant for fixing potetial mutex locking
order exercised by BF aborting and KILL command execution.

In this approach, KILL command is replicated as TOI operation.
This guarantees total isolation for the KILL command execution
in the first node: there is no concurrent replication applying
and no concurrent DDL executing. Therefore there is no risk of
BF aborting to happen in parallel with KILL command execution
either. Potential mutex deadlocks between the different mutex
access paths with KILL command execution and BF aborting cannot
therefore happen.

TOI replication is used, in this approach,  purely as means
to provide isolated KILL command execution in the first node.
KILL command should not (and must not) be applied in secondary
nodes. In this patch, we make this sure by skipping KILL
execution in secondary nodes, in applying phase, where we
bail out if applier thread is trying to execute KILL command.
This is effective, but skipping the applying of KILL command
could happen much earlier as well.

This patch also fixes mutex locking order and unprotected
THD member accesses on bf aborting case. We try to hold
THD::LOCK_thd_data during bf aborting. Only case where it
is not possible is at wsrep_abort_transaction before
call wsrep_innobase_kill_one_trx where we take InnoDB
mutexes first and then THD::LOCK_thd_data.

This will also fix possible race condition during
close_connection and while wsrep is disconnecting
connections.

Added wsrep_bf_kill_debug test case

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2021-09-24 09:47:31 +03:00
Oleksandr Byelkin
6efb5e9f5e Merge branch '10.5' into 10.6 2021-08-02 10:11:41 +02:00
Oleksandr Byelkin
ae6bdc6769 Merge branch '10.4' into 10.5 2021-07-31 23:19:51 +02:00
Rucha Deodhar
beb401b25f This commit is a fixup for MDEV-22189.
Changes:
changed mysqld -> mariadbd for some more error messages that were left.
2021-07-26 22:59:10 +05:30
Sergei Golubchik
4533e6ef65 MDEV-18353 Shutdown may miss to wait for connection thread
* count CONNECT objects too
* wait for all CONNECT objects to disappear (to be converted to THDs)
  before killing THDs away
2021-07-24 15:08:08 +02:00
Sergei Golubchik
b34cafe9d9 cleanup: move thread_count to THD_count::value()
because the name was misleading, it counts not threads, but THDs,
and as THD_count is the only way to increment/decrement it, it
could as well be declared inside THD_count.
2021-07-24 12:37:50 +02:00
Marko Mäkelä
0ad8a825a8 Merge 10.5 into 10.6 2021-07-02 17:00:05 +03:00
Marko Mäkelä
15dcb8bd3e Merge 10.4 into 10.5 2021-07-02 13:02:26 +03:00
Jan Lindström
a1e2ca057d MDEV-26030 : Warning: Memory not freed: 32 on setting wsrep_sst_auth
Call to wsrep_sst_auth_free() was missing from normal shutdown.
2021-06-30 10:38:44 +03:00
Monty
93f5d40656 Fixed debug_sync timeout in deadlock_drop_table
The issue was that we sent two different signals to different threads
after each other. The DEBUG_SYNC functionality cannot handle this (as
the signal is stored in a global variable) and the first one can get
lost.

Fixed by using the same signal for both threads.
2021-06-19 03:46:00 +03:00
Vladislav Vaintroub
3d6eb7afcf MDEV-25602 get rid of __WIN__ in favor of standard _WIN32
This fixed the MySQL bug# 20338 about misuse of double underscore
prefix __WIN__, which was old MySQL's idea of identifying Windows
Replace it by _WIN32 standard symbol for targeting Windows OS
(both 32 and 64 bit)

Not that connect storage engine is not fixed in this patch (must be
fixed in "upstream" branch)
2021-06-06 13:21:03 +02:00
Vladislav Vaintroub
3dd91241cb MDEV-24985 Shutdown fails to abort current InnoDB lock waits
On shutdown, to kill remaining connections, do the same thing that
server does during KILL CONNECTION, i.e thd->awake().

The stripped-down KILL version, that was used prior to this patch
for shutdown, missed the engine specific part (ha_kill_query)
2021-06-02 09:04:00 +02:00