The InnoDB DATA DIRECTORY attribute is not implemented via
symbolic links but via something similar: *.isl files that contain
the names of data files.
InnoDB failed to ignore the DATA DIRECTORY attribute even though
the server was started with --skip-symbolic-links.
Native ALTER TABLE in InnoDB will retain the DATA DIRECTORY attribute
of the table, no matter if the table will be rebuilt or not.
Generic ALTER TABLE (with ALGORITHM=COPY) as well as TRUNCATE TABLE
will discard the DATA DIRECTORY attribute.
All tests have been run with and without the ./mtr option
--mysqld=--skip-symbolic-links
and some tests that use the InnoDB DATA DIRECTORY attribute
have been adjusted for this.
.. to be the same as startup.
In resolving MDEV-27461, BUF_LRU_MIN_LEN (256) is the minimum number of
pages for the InnoDB buffer pool size. Obviously we need more than just
the flushing pages. Taking the 16k page size and its default minimum, an
extra 25% is needed on top of the flushing pages to make a workable buffer
pool.
The minimum innodb_buffer_pool_chunk_size (1M) also restricts the minimum,
as otherwise we'd have a pool made up of different chunk sizes.
The resulting minimum innodb buffer pool sizes are:
Page size   Previous minimum (startup)   Minimum with this change
4k          5M                           2M
8k          5M                           3M
16k         5M                           5M
32k         24M                          10M
64k         24M                          20M
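The new minimums in the table above can be reproduced with simple
arithmetic. The sketch below is only an illustration of how the numbers
combine, not the server code; min_buffer_pool_size() is a hypothetical
helper, and rounding up to whole 1M chunks is an assumption that happens
to match every row of the table.

```python
BUF_LRU_MIN_LEN = 256        # minimum number of pages, as stated above
MIN_CHUNK_SIZE = 1 << 20     # minimum innodb_buffer_pool_chunk_size (1M)


def min_buffer_pool_size(page_size):
    """Hypothetical helper: smallest workable pool size in bytes."""
    flushing = BUF_LRU_MIN_LEN * page_size
    workable = flushing + flushing // 4         # extra 25% on top
    chunks = -(-workable // MIN_CHUNK_SIZE)     # round up to whole chunks
    return chunks * MIN_CHUNK_SIZE


for kib in (4, 8, 16, 32, 64):
    print(f"{kib}k {min_buffer_pool_size(kib * 1024) >> 20}M")
```

For the 8k page size, for example, 256 pages are 2M, the extra 25% gives
2.5M, and rounding up to whole chunks gives 3M.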
With this patch, SET GLOBAL innodb_buffer_pool_size minimums are
enforced.
The evident minimum system variable value for innodb_buffer_pool_size
is 2M; however, this is only settable when using the 4k page size. As
the order of page_size and buffer_pool_size isn't fixed, we can't
hide this change.
Subsequent changes:
* innodb_buffer_pool_resize_with_chunks.test - raised the pool resize values due to the new
minimums. The chunk size also needed an increase, as the test relied on
pool_size < chunk_size to generate a warning.
* Removed srv_buf_pool_min_size and replaced its use with MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* Removed srv_buf_pool_def_size and replaced it with a constant definition in
MYSQL_SYSVAR_LONGLONG(buffer_pool_size)
* Reordered ha_innodb to allow for direct use of MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* Moved buf_pool_size_align into ha_innodb to give it access to MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* loose-innodb_disable_resize_buffer_pool_debug is needed in the
innodb.restart.opt test so that under debug mode, resizing of the
innodb buffer pool can occur.
The column INFORMATION_SCHEMA.INNODB_LOCKS.LOCK_DATA
would report NULL when the page that contains the locked
record does not reside in the buffer pool.
Pages may be evicted from the buffer pool due to some background
activity, such as the purge of transaction history loading
undo log pages to the buffer pool. The regression tests intentionally
run with a small buffer pool size setting.
To prevent the intermittent test failures, we will filter out the
contents of the LOCK_DATA column from the output.
First, we do not add VERS_UPDATE_UNVERSIONED_FLAG for system fields, and
that fixes the SHOW CREATE result.
Second, we have to call check_sys_fields() for any CREATE TABLE, and
there the correct type is checked for system fields.
Third, we update system_time like as_row structures for ALTER TABLE
and that makes check_sys_fields() happy for ALTER TABLE when we make
system fields hidden.
The update was skipped (need_update was false) because compare_record()
used the HA_PARTIAL_COLUMN_READ branch, which skipped the row_start check
because has_explicit_value() was false. When we set the bit for row_start in
has_value_set, the row is updated with the new row_start value.
The bug was caused by the combination of MDEV-23446 and 3789692d17. The
latter one says:
... But generated columns that are written to the table are always
deterministic and cannot change unless normal non-generated columns
were changed. ...
Since MDEV-23446, the generated row_start can change while non-generated
columns are not changed.
Explicit value flag came from HAS_EXPLICIT_DEFAULT which was used to
distinguish default-generated value from user-supplied one.
LIMIT history switching requires a number of history partitions to
be marked for read: from the first to the last non-empty one, plus one empty. The
least we can do is to fail with an error message if a needed partition
was not marked for read. As this is the handler interface, we require a new
handler error code to display a user-friendly error message.
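The partition requirement can be sketched as follows. This is one reading
of "from first to last non-empty plus one empty", not the handler code;
both function names, the row_counts input and the use of LookupError in
place of the new handler error code are hypothetical.

```python
def required_history_partitions(row_counts):
    """row_counts[i] is the number of rows in history partition i."""
    non_empty = [i for i, rows in enumerate(row_counts) if rows > 0]
    last = non_empty[-1] if non_empty else -1
    # first .. last non-empty, plus one empty partition after it if any
    end = min(last + 1, len(row_counts) - 1)
    return list(range(end + 1))


def check_marked_for_read(row_counts, marked):
    """Fail loudly when a needed partition was not marked for read."""
    missing = [p for p in required_history_partitions(row_counts)
               if p not in marked]
    if missing:
        raise LookupError(f"history partitions not marked for read: {missing}")
```

With row counts [5, 5, 2, 0, 0], partitions 0 to 3 are needed; marking
only partitions 0 to 2 would raise the error.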
Switching by INTERVAL works out-of-the-box with
ER_ROW_DOES_NOT_MATCH_GIVEN_PARTITION_SET error.
For MERGE-tables we need to init children list before calling
show_create_table and then detach children before we continue
normal mysql_create_like_table execution.
Problem:
========
A slave’s relay log format description event is used when
calculating Seconds_Behind_Master (SBM). This forces the SBM
value to spike when processing these events, as their creation
date is set to the timestamp at which the IO thread begins.
Solution:
========
When the slave generates a format description event, mark the
event as a relay log event so it does not update the
rli->last_master_timestamp variable.
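A minimal model of the fix, assuming SBM is simply "now minus
last_master_timestamp". The Event and RelayLogInfo classes and the
relay_log_event field are hypothetical stand-ins for the server
structures; only the idea of skipping the timestamp update comes from
the description above.

```python
from dataclasses import dataclass


@dataclass
class Event:
    when: int                       # creation timestamp, in seconds
    relay_log_event: bool = False   # generated by the slave itself


class RelayLogInfo:
    def __init__(self):
        self.last_master_timestamp = 0

    def apply(self, ev):
        # Slave-generated events must not move the master timestamp.
        if not ev.relay_log_event:
            self.last_master_timestamp = ev.when


def seconds_behind_master(rli, now):
    return max(0, now - rli.last_master_timestamp)
```

If a master event with when=1000 is followed by a slave-generated format
description event with when=100, the marked event leaves SBM at 5 when
now=1005, whereas an unmarked one would make it jump to 905.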
Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
CREATE-OR-REPLACE SEQUENCE is not logged with the Gtid event DDL flag,
which affects its parallel execution on the slave.
Unlike other DDLs, it can execute concurrently with the following transactions,
which can lead to various errors, including asserts like
(mdl_request->type != MDL_INTENTION_EXCLUSIVE && mdl_request->type != MDL_EXCLUSIVE) || !(get_thd()->rgi_slave && get_thd()->rgi_slave->is_parallel_exec && lock->check_if_conflicting_replication_locks(this))
in MDL_context::acquire_lock.
Fixed by wrapping the internal statement-level commit with a save-and-restore
of TRANS_THD::m_unsafe_rollback_flags.
failed in Diagnostics_area::set_ok_status in my_ok from
mysql_sql_stmt_prepare
Analysis: Before PREPARE is executed, binlog_format is STATEMENT.
This PREPARE had SET STATEMENT, which sets binlog_format to ROW. After
PREPARE is done, we reset binlog_format (back to STATEMENT). But because
there is a temporary table, changing binlog_format=ROW to
binlog_format=STATEMENT is not allowed, and the resulting error goes
unreported. This unreported error eventually causes the assertion failure.
Fix: Change the return type of LEX::restore_set_statement_var() to bool and
make it return the error state.
This is the first part of the fixes for MDEV-24097. This commit
contains the fixes for instability when testing Galera and when
restarting nodes quickly:
1) Protection against a "stuck" old SST process during the execution
of the new SST (after restarting the node) is now implemented for
mariabackup / xtrabackup, which should help to avoid almost all
conflicts due to the use of the same ports, both during testing
with mtr and when restarting nodes quickly in a production
environment.
2) Added more protection to scripts against unexpected return of
the rc != 0 (in the commands for deleting temporary files, etc).
3) Added protection against unexpected crashes during binlog transfer
(in SST scripts for rsync).
4) Spaces and some special characters in binlog filenames shouldn't
be a problem now (at the script level).
5) Daemon process termination tracking has been made more robust
against crashes due to unexpected termination of the previous SST
process while new scripts are running.
6) Reading ssl encryption parameters has been moved from specific
SST scripts to a common wsrep_sst_common.sh script, which allows
unified error handling, unified diagnostics and simplifies script
revisions in the future.
7) Improved diagnostics of errors related to the use of openssl.
8) Corrections have been made for xtrabackup-v2 (both in tests and in
the script code) that restore the work of xtrabackup with updated
versions of innodb.
9) Fixed some tests for galera_3nodes, although the complete solution
for the problem of starting three nodes at the same time on fast
machines will be done in a separate commit.
No additional tests are required as this commit fixes problems with
existing tests.
Since commit fb335b48b5 we may have
a null pointer in purge_sys.query when fetch_data_into_cache() is
invoked and innodb_force_recovery>4. This is because the call to
purge_sys.create() would be skipped.
fetch_data_into_cache(): Load the purge_sys pseudo transaction pointer
to a local variable (null pointer if purge_sys is not initialized).
MDEV-25803 excluded some cases from key sort upon ALTER TABLE. That
depends in particular on the ALTER_ADD_INDEX flag. Creating a column of
the SERIAL data type missed that flag, even though the equivalent operation
alter table t1 add x bigint unsigned not null auto_increment unique;
has the ALTER_ADD_INDEX flag.
create_log_files(): Check log_set_capacity() before modifying
or creating any log files.
innobase_start_or_create_for_mysql(): If create_log_files()
fails and we were initializing a new database, delete the
system tablespace files before exiting.
1. Galera SST scripts should use ssl_capath (not ssl_ca) for CA
directory. The current implementation tries to automatically
detect the path using the trailing slash in the ssl_ca variable
value, but this approach is not compatible with the server
configuration. Now, by analogy with the server, SST scripts
also use a separate ssl_capath variable. In addition, a similar
tcapath variable has been added for the old-style configuration
(in the "sst" section).
2. Openssl utility detection made more reliable.
3. Removed extra spaces in automatically generated command lines -
to simplify debugging of the SST scripts.
4. In general, the code for detecting the presence or absence of
auxiliary utilities has been improved - it is made more reliable
in some configurations (and for shells other than bash).
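The difference described in item 1 can be illustrated as follows. This
is not the SST shell code; both functions and the "cafile" / "capath"
result keys are hypothetical names chosen for the sketch.

```python
def ca_args_old(ssl_ca):
    """Old heuristic: a trailing slash in ssl_ca means 'CA directory'."""
    if ssl_ca.endswith("/"):
        return {"capath": ssl_ca}
    return {"cafile": ssl_ca} if ssl_ca else {}


def ca_args_new(ssl_ca="", ssl_capath=""):
    """New style: file and directory come from separate variables."""
    args = {}
    if ssl_ca:
        args["cafile"] = ssl_ca
    if ssl_capath:
        args["capath"] = ssl_capath
    return args
```

With the old heuristic, a directory given without a trailing slash is
silently treated as a file; with separate variables no guessing is
needed.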
Currently, SST scripts assume that the filename specified in
the --log-bin-index argument either does not contain an extension
or uses the standard ".index" extension. Similar assumptions are
used for the log_bin_index parameter read from the configuration
file. This commit adds support for arbitrary extensions for the
index file paths.
If the server is started with the --innodb-force-recovery argument
on the command line, then during SST this argument can be passed to
mariabackup only at the --prepare stage, and accordingly it must be
removed from the --mysqld-args list (and it should not be passed
to mariabackup otherwise).
This commit fixes a flaw in the SST scripts and adds a test that
checks the ability to run the joiner node in a configuration that
uses --innodb-force-recovery=1.
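The argument handling can be sketched like this. It is an illustration
only, not the SST shell code; split_force_recovery() is a hypothetical
helper, and it relies on the server treating "-" and "_" in option names
as equivalent.

```python
def split_force_recovery(mysqld_args):
    """Return (args without --innodb-force-recovery, removed option or None)."""
    kept, recovery = [], None
    for arg in mysqld_args:
        # Normalize only the option name, never the value.
        name = arg.split("=", 1)[0].replace("_", "-")
        if name == "--innodb-force-recovery":
            recovery = arg      # to be used at the --prepare stage only
        else:
            kept.append(arg)
    return kept, recovery
```

The first element of the result is what stays in the --mysqld-args list;
the second is the option that may be handed over at the --prepare stage.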
fil_space_decrypt(): change signature to return status via dberr_t only.
Also replace impossible condition with an assertion and prove it via
test cases.
When a transaction creates or drops temporary tables and one of its later
statements fails with an error, even the cached ROW format events of that
statement on a transactional table end up in the binlog and are visible
after the transaction's commit.
Fixed with proper analysis of whether the errored-out statement needs
to be rolled back in binlog.
For instance, the mere fact that previous statements have already cached
a CREATE or DROP for temporary tables
does not cause the events of the errored-out statement to be retained in
the cache.
Conversely, if the statement creates or drops a temporary table
itself, it can't be rolled back - this rule remains.
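The rule can be modelled with a toy transaction cache. This is a sketch
of the decision only; the TrxCache class and its parameters are
hypothetical and do not mirror the binlog cache implementation.

```python
class TrxCache:
    """Toy model of the per-transaction binlog cache."""

    def __init__(self):
        self.events = []

    def run(self, events, creates_or_drops_temp=False, failed=False):
        savepoint = len(self.events)
        self.events.extend(events)
        # Only the failing statement itself decides whether its events
        # stay: earlier temporary table DDL in the cache does not matter.
        if failed and not creates_or_drops_temp:
            del self.events[savepoint:]
```

A failed row statement that follows an earlier CREATE TEMPORARY TABLE is
removed from the cache, while a failed statement that itself creates or
drops a temporary table is kept.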
versioning_fields flag indicates that any columns were specified WITH
SYSTEM VERSIONING. In that case we imply WITH SYSTEM VERSIONING for
the whole table and WITHOUT SYSTEM VERSIONING for the other columns.
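As an illustration of the implication (not the parser code),
resolve_versioning() below is a hypothetical helper that takes the
per-column specification, with True for WITH, False for WITHOUT and None
for unspecified.

```python
def resolve_versioning(columns):
    """Return (table_is_versioned, per-column versioning) for a column spec."""
    versioning_fields = any(v is True for v in columns.values())
    if not versioning_fields:
        return False, dict(columns)
    # Table becomes versioned; every other column is implied WITHOUT.
    return True, {name: (v is True) for name, v in columns.items()}
```

For columns {"a": True, "b": None} the table is treated as versioned and
"b" is resolved to WITHOUT SYSTEM VERSIONING.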
When restoring lastinx, last_key.keyinfo must be updated as well. A
good example is in _ma_check_index().
The point of failure is extra(HA_EXTRA_NO_KEYREAD) in
ha_maria::get_auto_increment():
1. extra(HA_EXTRA_KEYREAD) saves lastinx;
2. maria_rkey() changes the index, and so both lastinx and last_key.keyinfo;
3. extra(HA_EXTRA_NO_KEYREAD) restores lastinx but not
last_key.keyinfo.
So we have a discrepancy between lastinx and last_key.keyinfo after step 3.
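The three steps above can be replayed on a toy model. The Handler class
is hypothetical and only mimics the two fields involved; the "fix" flag
switches between the old and the corrected restore behaviour.

```python
class Handler:
    """Toy stand-in for the two pieces of state that must stay in sync."""

    def __init__(self, keys):
        self.keys = keys
        self.lastinx = 0
        self.last_key_keyinfo = keys[0]
        self._saved_lastinx = None

    def extra_keyread(self):                 # step 1: save lastinx
        self._saved_lastinx = self.lastinx

    def rkey(self, inx):                     # step 2: switch index
        self.lastinx = inx
        self.last_key_keyinfo = self.keys[inx]

    def extra_no_keyread(self, fix):         # step 3: restore
        self.lastinx = self._saved_lastinx
        if fix:
            self.last_key_keyinfo = self.keys[self.lastinx]

    def consistent(self):
        return self.last_key_keyinfo is self.keys[self.lastinx]
```

Running extra_keyread(), rkey(1) and extra_no_keyread(fix=False) leaves
consistent() false; with fix=True both fields point at the same index
again.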
Replaced the HA_ADMIN_NOT_IMPLEMENTED error code with HA_ADMIN_OK. Now CHECK
TABLE does not fail because of the unsupported check_misplaced_rows(). An
admin message is not needed either.
The test case is the same as for MDEV-21011 (a7cf0db3d8); the result has
been changed.
strmake() puts one extra 0x00 byte at the end of the string.
The code in my_strnxfrm_tis620[_nopad] did not take this into
account, so in the reported scenario the 0x00 byte was put outside
of a stack variable, which made ASAN crash.
This problem is already fixed in MySQL:
commit 19bd66fe43c41f0bde5f36bc6b455a46693069fb
Author: bin.x.su@oracle.com <>
Date: Fri Apr 4 11:35:27 2014 +0800
But the fix does not seem to be correct, as it breaks when it finds a zero byte
in the source string.
Using memcpy() instead of strmake().
- Unlike strmake(), memcpy() does not write beyond the destination
size passed.
- Unlike the MySQL fix, memcpy() does not break on the first 0x00 byte found
in the source string.
- Unlike the MySQL fix, memcpy() does not break on the first 0x00 byte found
in the source string.
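The two properties can be demonstrated on a byte buffer with one guard
byte after a 4-byte destination. The functions below are Python models
of the C routines, written for this illustration only; the strmake()
model follows its documented behaviour of stopping at a 0x00 byte and
otherwise appending a terminator after "length" bytes.

```python
def strmake(dst, src, length):
    """Model of strmake(): copy up to length bytes, then add a 0x00."""
    for i in range(length):
        b = src[i] if i < len(src) else 0
        dst[i] = b
        if b == 0:
            return i              # stopped at a zero byte in the source
    dst[length] = 0               # terminator lands at offset 'length'
    return length


def memcpy(dst, src, length):
    """Model of memcpy(): copy exactly length bytes, nothing more."""
    for i in range(length):
        dst[i] = src[i]


guard = 0xAA
buf = bytearray([guard] * 5)      # 4 destination bytes + 1 guard byte
strmake(buf, b"abcd", 4)
print(buf[4] == guard)            # guard byte was overwritten

buf = bytearray([guard] * 5)
memcpy(buf, b"ab\x00d", 4)
print(bytes(buf[:4]), buf[4] == guard)
```

In the model, strmake() overwrites the guard byte, while memcpy() keeps
it intact and also copies the bytes after an embedded 0x00.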
Mutex order violation when wsrep bf thread kills a conflicting trx,
the stack is
wsrep_thd_LOCK()
wsrep_kill_victim()
lock_rec_other_has_conflicting()
lock_clust_rec_read_check_and_lock()
row_search_mvcc()
ha_innobase::index_read()
ha_innobase::rnd_pos()
handler::ha_rnd_pos()
handler::rnd_pos_by_record()
handler::ha_rnd_pos_by_record()
Rows_log_event::find_row()
Update_rows_log_event::do_exec_row()
Rows_log_event::do_apply_event()
Log_event::apply_event()
wsrep_apply_events()
and mutexes are taken in the order
lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data
When a normal KILL statement is executed, the stack is
innobase_kill_query()
kill_handlerton()
plugin_foreach_with_mask()
ha_kill_query()
THD::awake()
kill_one_thread()
and mutexes are
victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex
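The conflict between the two orders above can be checked mechanically.
find_inversion() is a hypothetical helper for this illustration; it only
detects a pair of mutexes taken in opposite orders, which is enough for
the two paths listed here.

```python
def find_inversion(orders):
    """Return a pair (a, b) that is taken as a->b and also as b->a, or None."""
    before = set()
    for order in orders:
        for i, a in enumerate(order):
            for b in order[i + 1:]:
                if (b, a) in before:
                    return (b, a)
                before.add((a, b))
    return None


bf_abort = ["lock_sys->mutex", "victim_trx->mutex",
            "victim_thread->LOCK_thd_data"]
kill = ["victim_thread->LOCK_thd_data", "lock_sys->mutex",
        "victim_trx->mutex"]
print(find_inversion([bf_abort, kill]))
```

The reported pair is lock_sys->mutex and victim_thread->LOCK_thd_data:
the BF abort path takes them in that order and the KILL path in the
opposite one.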
This patch is the plan D variant for fixing the potential mutex locking
order violation exercised by BF aborting and KILL command execution.
In this approach, KILL command is replicated as TOI operation.
This guarantees total isolation for the KILL command execution
in the first node: there is no concurrent replication applying
and no concurrent DDL executing. Therefore there is no risk of
BF aborting to happen in parallel with KILL command execution
either. Potential mutex deadlocks between the different mutex
access paths with KILL command execution and BF aborting cannot
therefore happen.
TOI replication is used, in this approach, purely as a means
to provide isolated KILL command execution in the first node.
The KILL command should not (and must not) be applied in secondary
nodes. In this patch, we ensure this by skipping KILL
execution in secondary nodes, in the applying phase, where we
bail out if the applier thread is trying to execute a KILL command.
This is effective, but skipping the applying of the KILL command
could happen much earlier as well.
This also fixed unprotected calls to wsrep_thd_abort
that will use wsrep_abort_transaction. This is fixed
by holding THD::LOCK_thd_data while we abort the transaction.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>