Variable wsrep_forced_binlog_format has higher priority than
binlog_format. In situation where STATEMENT is used and DELAYED INSERT
is executing we should fall back to non-delay INSERT.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
Using `innodb_thread_concurrency` will call `wsrep_thd_is_aborting` to
check WSREP thread state. This call should be protected by taking
`LOCK_thd_data` before entering function.
Applier and TOI threads should no be affected with usage of
`innodb_thread_concurrency` variable so returning before any checks.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
If a table has no unique indexes, write set key information will be collected on all columns in the table.
The write set key information has space only for max 3500 bytes for individual column, and if a varchar colummn of such non-primary key table is longer than
this limit, currently a crash follows.
The fix in this commit, is to truncate key values extracted from such long varhar columns to max 3500 bytes.
This may potentially lead to false positive certification failures for transactions, which operate on separate cluster nodes, and update/insert/delete table rows, which differ only in the part of such long columns after 3500 bytes border.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
In commit 45ed9dd957 (MDEV-23855)
when removing fil_system.LRU we failed to rate-limit the output
for reporting violations of innodb_open_files or open_files_limit.
If the server is run with a small limit of open files that is
well below the number of .ibd files that are being accessed by
the workload, and if at the same time innodb_log_file_size is
very small so that log checkpoints will occur frequently,
the process of enforcing the open files limit may be run very often.
fil_space_t::try_to_close(): Display at most one message per call,
and only if at least 5 seconds have elapsed since the last time a
message was output.
fil_node_open_file(): Only output a summary message if
fil_space_t::try_to_close() displayed a message during this run.
(Note: multiple threads may execute fil_node_open_file() on
different files at the same time.)
fil_space_t::get(): Do not dereference a null pointer if n & STOPPING.
This was caught by the test case below.
Unfortunately, it is not possible to create a fully deterministic
test case (expecting exactly 1 message to be emitted). The following with
--innodb-open-files=10 --innodb-log-file-size=4m
would occasionally fail to find the message in the log:
--source include/have_innodb.inc
--source include/have_partition.inc
--source include/have_sequence.inc
call mtr.add_suppression("InnoDB: innodb_open_files=10 is exceeded");
CREATE TABLE t1 (pk INT AUTO_INCREMENT PRIMARY KEY) ENGINE=InnoDB
PARTITION BY key (pk) PARTITIONS 100;
INSERT INTO t1 SELECT * FROM seq_1_to_100;
--disable_query_log
let $n=400;
while ($n)
{
BEGIN; DELETE FROM t1; ROLLBACK;
dec $n;
}
--enable_query_log
let SEARCH_FILE= $MYSQLTEST_VARDIR/log/mysqld.1.err;
let SEARCH_PATTERN= \[Note\] InnoDB: Cannot close file;
-- source include/search_pattern_in_file.inc
DROP TABLE t1;
Normally we disable caching of routines in "SHOW CREATE".
Introduce an exception, if debug_dbug="+d,cache_sp_in_show_create".
lock_sync.test needs a way to populate the cache without side effects,
or else it runs into debug_sync timeouts.
So, this possibility to cache will be remain only for very special tests.
Occasionally, after restart, additional transactions will have been
executed, possibly related to innodb_stats_auto_recalc.
We should only care that the transaction ID sequence does
not go backwards.
Let us mask the actual values of the defragmentation-related fields,
because they may vary. Also, remove the dependency on purge,
and instead delete records by a ROLLBACK of INSERT.
The reason for this behavior is that SP get cached, per connection.
The stored_program_cache is size of this cache, which amounts to 256
routines by default. A compiled stored procedure can easily be several
megabytes in size. Thus calling SHOW CREATE PROCEDURE for all stored
procedures, like mysqldump does, can require significant amount of memory.
Fixed by bypassing the cache for "SHOW CREATE". This should normally be
fine also perfomance-wise, as cache is meant to be used for repeated
execution, not repeated SHOW CREATEs.
Added a test to verify that CREATE PROCEDURE + SHOW CREATE PROCEURE do not
cache, i.e amount of allocated memory does not change.
Note, there is a change in existing behavior in an edge case :
If "SHOW CREATE PROCEDURE p1" called from p1, after p1 was altered, now
this will now return altered code. Previour behavior - relied on caching
and would return old code. The previous behavior might was not necessarily
correct.
Use in_sum_func (and so nest_level) only in LEX to which SELECT lex belong to
Reduce usage of current_select (because it does not always point on the correct
SELECT_LEX, for example with prepare.
Change context for all classes inherited from Item_ident (was only for Item_field) in case of pushing down it to HAVING.
Now name resolution context have to have SELECT_LEX reference if the context is present.
Fixed feedback plugin stack usage.
Create minidump when server fails to shutdown. If process is being
debugged, cause a debug break.
Moves some code which is part of safe_kill into mysys, as both safe_kill,
and mysqltest produce minidumps on different timeouts.
Small cleanup in wait_until_dead() - replace inefficient loop with a single
wait.
Fixed flaws with overly strict or, conversely,
overly soft verification of certificates in some
scenarios:
1. Removed the check that the 'commonname' (CN) in the
certificate matches the 'localhost' value on the side
of the joiner node, which was performed earlier, even
if the address was received by the script only as an
argument (out of the exchange via the Galera protocol) -
since for the joining node this argument always contains
its own local address, not the address of the remote host,
so it is always treated as 'localhost', which is not
necessarily true (outside of mtr testing);
2. Removed checking the domain name or IP-address of the
peer node in the encrypt=2 mode;
3. Fixed checking of compliance of certificates when
rsync SST is used;
4. Added the ability to specify CA not only as a file,
but also as a path to the directory where the certificates
are stored. To do this, the user just needs to specify the
path to this directory as the value ssl-ca or tca parameter,
ending with the '/' character.
trx_rseg_header_create(): Add a parameter for the value that is
to be written to TRX_RSEG_MAX_TRX_ID. If we omit this write, then
the updated test innodb.undo_truncate will fail for the 4k, 8k, 16k
page sizes. This was broken ever since
commit 947efe17ed (MDEV-15158)
removed the writes of transaction identifiers to the TRX_SYS page.
srv_do_purge(): Truncate undo tablespaces also during slow shutdown
(innodb_fast_shutdown=0).
Thanks to Krunal Bauskar for noticing this problem.
RedHat systems have both files for lsb and init functions.
Old code was written as if/else, so second file (RedHat-specific) was not processed.
So, systemd redirect didn't work, because its logic is described in
RedHat-specific functions file
This patch is the plan D variant for fixing potetial mutex locking
order exercised by BF aborting and KILL command execution.
In this approach, KILL command is replicated as TOI operation.
This guarantees total isolation for the KILL command execution
in the first node: there is no concurrent replication applying
and no concurrent DDL executing. Therefore there is no risk of
BF aborting to happen in parallel with KILL command execution
either. Potential mutex deadlocks between the different mutex
access paths with KILL command execution and BF aborting cannot
therefore happen.
TOI replication is used, in this approach, purely as means
to provide isolated KILL command execution in the first node.
KILL command should not (and must not) be applied in secondary
nodes. In this patch, we make this sure by skipping KILL
execution in secondary nodes, in applying phase, where we
bail out if applier thread is trying to execute KILL command.
This is effective, but skipping the applying of KILL command
could happen much earlier as well.
This patch also fixes mutex locking order and unprotected
THD member accesses on bf aborting case. We try to hold
THD::LOCK_thd_data during bf aborting. Only case where it
is not possible is at wsrep_abort_transaction before
call wsrep_innobase_kill_one_trx where we take InnoDB
mutexes first and then THD::LOCK_thd_data.
This will also fix possible race condition during
close_connection and while wsrep is disconnecting
connections.
Added wsrep_bf_kill_debug test case
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
trx_purge_truncate_history(): Do not force a write of the undo tablespace
that is being truncated. Instead, prevent page writes by acquiring
an exclusive latch on all dirty pages of the tablespace.
fseg_create(): Relax an assertion that could fail if a dirty undo page
is being initialized during undo tablespace truncation (and
trx_purge_truncate_history() already acquired an exclusive latch on it).
fsp_page_create(): If we are truncating a tablespace, try to reuse
a page that we may have already latched exclusively (because it was
in buf_pool.flush_list). To some extent, this helps the test
innodb.undo_truncate,16k to avoid running out of buffer pool.
mtr_t::commit_shrink(): Mark as clean all pages that are outside the
new bounds of the tablespace, and only add the newly reinitialized pages
to the buf_pool.flush_list.
buf_page_create(): Do not unnecessarily invoke change buffer merge on
undo tablespaces.
buf_page_t::clear_oldest_modification(bool temporary): Move some
assertions to the caller buf_page_write_complete().
innodb.undo_truncate: Use a bigger innodb_buffer_pool_size=24M.
On my system, it would otherwise hang 1 out of 1547 attempts
(on the 40th repeat of innodb.undo_truncate,16k).
Other page sizes were not affected.
At least since commit 055a3334ad
(MDEV-13564) the undo log truncation in InnoDB did not work correctly.
The main issue is that during the execution of
trx_purge_truncate_history() some pages of the newly truncated
undo tablespace could be discarded.
This is improved from commit 1cb218c37c
which was applied to earlier-version branches.
fsp_try_extend_data_file(): Apply the peculiar rounding of
fil_space_t::size_in_header only to the system tablespace,
whose size can be expressed in megabytes in a configuration parameter.
Other files may freely grow by a number of pages.
fseg_alloc_free_page_low(): Do allow the extension of undo tablespaces,
and mention the file name in the error message.
mtr_t::commit_shrink(): Implement crash-safe shrinking of a tablespace:
(1) durably write the log
(2) release the page latches of the rebuilt tablespace
(3) release the mutexes
(4) truncate the file
(5) release the tablespace latch
This is refactored from trx_purge_truncate_history().
log_write_and_flush_prepare(), log_write_and_flush(): New functions
to durably write log during mtr_t::commit_shrink().
While the redo log is being resized in srv_start(),
we must not write checkpoint information to the old log.
Thanks to Matthias Leich for noticing this.
Actual problem was that we tried to calculate persistent statistics
to wsrep_schema tables in this case wsrep_streaming_log. These tables
should not have persistent statistics. Therefore, in table creation
tables should be created with STATS_PERSISTENT=0 table option. During
rolling-upgrade tables naturally already exists, thus we need to
alter them to contain STATS_PERSISTENT=0 table option.
At least since commit 055a3334ad
(MDEV-13564) the undo log truncation in InnoDB did not work correctly.
The main issue is that during the execution of
trx_purge_truncate_history() some pages of the newly truncated
undo tablespace could be discarded.
fsp_try_extend_data_file(): Apply the peculiar rounding of
fil_space_t::size_in_header only to the system tablespace,
whose size can be expressed in megabytes in a configuration parameter.
Other files may freely grow by a number of pages.
fseg_alloc_free_page_low(): Do allow the extension of undo tablespaces,
and mention the file name in the error message.
mtr_t::commit_shrink(): Implement crash-safe shrinking of a tablespace
file. First, durably write the log, then shrink the file, and finally
release the page latches of the rebuilt tablespace. Refactored from
trx_purge_truncate_history().
log_write_and_flush_prepare(), log_write_and_flush(): New functions
to durably write log during mtr_t::commit_shrink().