extra2_read_len resolved by keeping the implementation
in sql/table.cc by exposed it for use by ha_partition.cc
Remove identical implementation in unireg.h
(ref: bfed2c7d57)
The 10.5 test error main.grant_kill showed up a incorrect
thread id on a big endian architecture.
The cause of this is the sql_kill_user function assumed the
error was ER_OUT_OF_RESOURCES, when the the actual error was
ER_KILL_DENIED_ERROR. ER_KILL_DENIED_ERROR as an error message
requires a thread id to be passed as unsigned long, however a
user/host was passed.
ER_OUT_OF_RESOURCES doesn't even take a user/host, despite
the optimistic comment. We remove this being passed as an
argument to the function so that when MDEV-21978 is implemented
one less compiler format warning is generated (which would
have caught this error sooner).
Thanks Otto for reporting and Marko for analysis.
- InnoDB fails to skip newly created column while checking for
change column when table is in redundant row format. This issue
is caused the MDEV-18035 (ccb1acbd3c)
Let us explicitly wait for purge before invoking a slow shutdown,
so that instrumented builds (such as ASAN or UBSAN) will not
exceed the 60-second timeout during shutdown.
The first step for deprecating innodb_autoinc_lock_mode(see MDEV-27844) is:
- to switch statement binlog format to ROW if binlog format is MIXED and
the statement changes autoincremented fields
- issue warnings if innodb_autoinc_lock_mode == 2 and binlog format is
STATEMENT
The warning out of OPTIMIZE
Statement is unsafe because it uses a system function
was indeed counterfactual and was resulted by checking an
insufficiently strict property of lex' sql_command_flags.
Fixed with deploying an additional checking of weather
the current sql command that modifes a share->non_determinstic_insert
table is capable of generating ROW format events.
The extra check rules out the unsafety to OPTIMIZE et al, while the
existing check continues to do so to CREATE TABLE (which is
perculiarly tagged as ROW-event generative sql command).
As a side effect sql_sequence.binlog test gets corrected and
binlog_stm_unsafe_warning.test is reinforced to add up
an unsafe CREATE..SELECT test.
MDEV-27025 allows to insert records before the record on which DELETE is
locked, as a result the DELETE misses those records, what causes serious ACID
violation.
Revert MDEV-27025, MDEV-27550. The test which shows the scenario of ACID
violation is added.
- InnoDB FTS DDL decrements the FTS_DOC_ID when there is a
deleted marked record involved. FTS_DOC_ID must never be
reused. The purpose of FTS_DOC_ID is to be a unique row
identifier that will be changed whenever a fulltext indexed
column is updated.
btr_cur_optimistic_insert(): Disregard DEBUG_DBUG injection to
invoke btr_page_reorganize() if the page (and the table) is empty.
Otherwise, an assertion would fail in btr_page_reorganize_low()
because PAGE_MAX_TRX_ID is 0 in an empty secondary index leaf page.
- Server incorrectly downgrading the MDL after prepare phase when
table is empty. mdl_exclusive_after_prepare is being set in
prepare phase only. But mdl_exclusive_after_prepare condition was
misplaced and checked before prepare phase by
commit d270525dfd and it is now
changed to check after prepare phase.
- main.innodb_mysql_sync test case was changed to avoid locking
optimization when table is empty.
DEBUG_SYNC signals can get lost in certain tests due to later
DEBUG_SYNC commands overwriting them. This patch addresses
these issues in three tests: main.query_cache_debug,
main.partition_debug_sync, and
rpl.rpl_dump_request_retry_warning.
Additionally, main.partition_debug_sync needed changes to the
result file (the others did not). The synchronization happened
between two commands, one based on ALTER, the other on DROP.
A new thread/connection was needed to synchronize the DEBUG_SYNC
actions between these commands, thereby changing the result file.
Additional comments were added for clarification.
Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
- Make innodb_ft_cache_size & innodb_ft_total_cache_size are dynamic
variable and increase the maximum value of innodb_ft_cache_size to
512MB for 32-bit system and 1 TB for 64-bit system and set
innodb_ft_total_cache_size maximum value to 1 TB for 64-bit system.
- Print warning if the fts cache exceeds the innodb_ft_cache_size
and also unlock the cache if fts cache memory reduces less than
innodb_ft_cache_size.
This commit adds correct handling of binlogs for SST using rsync
or mariabackup. Before this fix, binlogs were handled incorrectly -
- only one (last) binary log file was transferred during SST, which
then led to various failures (for example, when trying to list all
events from the binary log). These bugs were long masked by flaws
in the primitive binlogs handling code in the SST scripts, which
causing binary logs files to be erased after transfer or not added
to the binlog index on the joiner node. Now the correct transfer
of all binary logs (not just the last of the binary log files) has
been implemented both for the rsync (at the script level) and for
the mariabackup (at the level of the main utility code).
This commit also adds a new sst_max_binlogs=<n> parameter, which
can be located in the [sst] section or in the [xtrabackup] section
(historically, supported for mariabackup only, not for rsync), or
in one of the server sections. This parameter specifies the number
of binary log files to be sent to the joiner node during SST. This
option is added for compatibility with old SST scripting behavior,
which can be emulated by setting the sst_max_binlogs=1 (although
in general this can cause problems for the reasons described above).
In addition, setting the sst_max_binlogs=0 can be used to suppress
the transmission of binary logs to the joiner nodes during SST
(although sometimes a single file with the current binary log can
still be transmitted to the joiner, even with sst_max_binlogs=0,
because this sometimes necessary in modes that involve the use of
GTIDs with Galera).
Also, this commit ensures correct handling of paths to various
innodb files and directories in the SST scripts, and fixes some
problems with this that existed in mariabackup utility (which
were associated with incorrect handling of the innodb_data_dir
parameter in some scenarios).
In addition, this commit contains the following enhancements:
1) Added tests for mtr, which check the correct work with binlogs
after SST (using rsync and mariabackup);
2) Added correct handling of slashes at the end of all paths that
the SST script receives as parameters;
3) Improved parsing code for --mysqld-args parameters. Now it
correctly processes the sequence "--" after the name of the
one-letter option;
4) Checking the secret signature during joiner authentication
is made independent of presence of bash (as a unix shell)
in the system and diff utility no longer needed to check
certificates compliance;
5) All directories that are necessary for the correct placement
of various logs are automatically created by SST scripts in
advance (before running mariabackup on the joiner node);
6) Removal of old binary logs on joiner is done using the binlog
index (if it exists) (not only by fixed pattern that based
on the current binlog name, as before);
7) Paths for placing binary logs are correctly processed if they
are set as relative paths (to the datadir);
8) SST scripts are made even more resistant to spaces in filenames
(now for binlogs);
9) In case of failure, SST scripts now always end with an exit
code other than zero;
10) SST script for rsync now correctly create a tar file with
the binlogs, even if the paths to them (in the binlog index
file) are specified as a mix of absolute and relative paths,
and even if they do not match with the datadir path specified
in the current configuration settings.
(Edits by SergeiP: fix encryption.tempfiles_encrypted, re-word comment)
Global ORDER BY clause of a UNION may not refer to 1) aggregate functions
or 2) window functions. setup_order() checked for #1 but not for #2.
Backported from 10.5 20e9e804c1 and
5948d7602e.
sel_restore_position_for_mysql() moves forward persistent cursor
position after btr_pcur_restore_position() call if cursor relative position
is BTR_PCUR_ON and the cursor points to the record with NOT the same field
values as in a stored record(and some other not important for this case
conditions).
It was done because btr_pcur_restore_position() sets
page_cur_mode_t mode to PAGE_CUR_LE for cursor->rel_pos == BTR_PCUR_ON
before opening cursor. So we are searching for the record less or equal
to stored one. And if the found record is not equal to stored one, then
it is less and we need to move cursor forward.
But there can be a situation when the stored record was purged, but the
new one with the same key but different value was inserted while
row_search_mvcc() was suspended. In this case, when the thread is
awaken, it will invoke sel_restore_position_for_mysql(), which, in turns,
invoke btr_pcur_restore_position(), which will return false because found
record don't match stored record, and
sel_restore_position_for_mysql() will move forward cursor position.
The above can lead to the case when awaken row_search_mvcc() do not see
records inserted by other transactions while it slept. The mtr test case
shows the example how it can be.
The fix is to return special value from persistent cursor restoring
function which would notify its caller that uniq fields of restored
record and stored record are the same, and in this case
sel_restore_position_for_mysql() don't move cursor forward.
Delete-marked records are correctly processed in row_search_mvcc().
Non-unique secondary indexes are "uniquified" by adding the PK, the
index->n_uniq should then be index->n_fields. So there is no need in
additional checks in the fix.
If transaction's readview can't see the changes made in secondary index
record, it requests clustered index record in row_search_mvcc() to check
its transaction id and get the correspondent record version. After this
row_search_mvcc() commits mtr to preserve clustered index latching
order, and starts mtr. Between those mtr commit and start secondary
index pages are unlatched, and purge has the ability to remove stored in
the cursor record, what causes rows duplication in result set for
non-locking reads, as cursor position is restored to the previously
visited record.
To solve this the changes are just switched off for non-locking reads,
it's quite simple solution, besides the changes don't make sense for
non-locking reads.
The more complex and effective from performance perspective solution is
to create mtr savepoint before clustered record requesting and rolling
back to that savepoint after that. See MDEV-27557.
One more solution is to have per-record transaction id for secondary
indexes. See MDEV-17598.
If any of those is implemented, just remove select_lock_type argument in
sel_restore_position_for_mysql().
convert_error_code_to_mysql(): Use the correct limit FK_MAX_CASCADE_DEL
in the error message. The DICT_FK_MAX_RECURSIVE_LOAD applies to
the number of foreign key constraints in table definitions,
not to the number of rows that are visited while processing
a foreign key constraint.
MDEV-22500 Assertion `thd->killed != 0' failed in ha_maria::enable_indexes
For MDEV-17223 the issue was an assert that didn't take into account that
we could get duplicate key errors when enablling unique indexes.
Fixed by not retrying repair in case of duplicate key error for this
case, which avoids the assert.
For MDEV-22500 I removed the assert, as it's not critical (just a way to
find potential wrong code) and we will anyway get things logged in the
error log if this happens. This case cannot triggered an assert in 10.3
but I verified that it would trigger in 10.5 and that this patch fixes
it.
Some GNU/Linux distributions ship a zlib that is modified to use
the s390x DFLTCC instruction. That modification would essentially
redefine compressBound(sourceLen) as (sourceLen * 16 + 2308) / 8 + 6.
Let us relax the tests for InnoDB ROW_FORMAT=COMPRESSED to cope with
such a weaker compression guarantee.
create_table_info_t::row_size_is_acceptable(): Remove a bogus debug-only
assertion that would fail to hold for the test innodb_zip.bug36169.
The function page_zip_empty_size() may indeed return 0.
The rpl.rpl_seconds_behind_master_spike test would sometimes
timeout or take a very long time to complete. This happened
because an MTR DEBUG_SYNC signal would be lost due to a
subsequent call to RESET. I.e., the slave SQL thread would
be paused due to the WAIT_FOR signal being lost, resulting in
either a failed test if the `select master_pos_wait` timeout
occurs first, or a very long run-time if the DBUG_SYNC timeout
occurs first.
The fix ensures that the MTR signal is processed by the slave
SQL thread before issuing the call to RESET
Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
row_sel_sec_rec_is_for_clust_rec() treats empty BLOB prefix field in
secondary index as a field equal to any external BLOB field in clustered
index. Row_sel_get_clust_rec_for_mysql::operator() doesn't zerro out
clustered record pointer in row_search_mvcc(), and row_search_mvcc()
thinks that delete-marked secondary index record has visible for
"CHECK TABLE"'s read view old-versioned clustered index record, and
row_scan_index_for_mysql() counts it as a row.
The fix is to execute row_sel_sec_rec_is_for_blob() in
row_sel_sec_rec_is_for_clust_rec() if clustered field contains BLOB's
reference.