Commit graph

3189 commits

Author SHA1 Message Date
Vladislav Vaintroub
4e667b9638 MDEV-27877 considerably speed up innodb_force_recovery test
decrease innodb_lock_wait_timeout for the current session.
2022-02-17 18:25:32 +01:00
Marko Mäkelä
cac995ec6f Merge 10.4 into 10.5 2022-02-17 11:58:25 +02:00
Marko Mäkelä
6f4740fde7 MDEV-27723 innodb.instant_alter,8k.rdiff does not apply on FreeBSD
The .rdiff files were not correctly adjusted in
commit 8f4a3bf07c (MDEV-25057).
2022-02-17 11:49:13 +02:00
Marko Mäkelä
f921db7aa5 Merge 10.3 into 10.4 2022-02-17 11:33:08 +02:00
Marko Mäkelä
5b237e5965 Merge 10.2 into 10.3 2022-02-17 10:53:58 +02:00
Marko Mäkelä
73c391afc5 MDEV-27583 InnoDB uses different constants for FK cascade error message in SQL vs error log
convert_error_code_to_mysql(): Use the correct limit FK_MAX_CASCADE_DEL
in the error message. The DICT_FK_MAX_RECURSIVE_LOAD applies to
the number of foreign key constraints in table definitions,
not to the number of rows that are visited while processing
a foreign key constraint.
2022-02-17 10:48:24 +02:00
Marko Mäkelä
cf574cf53b MDEV-27634 innodb_zip tests failing on s390x
Some GNU/Linux distributions ship a zlib that is modified to use
the s390x DFLTCC instruction. That modification would essentially
redefine compressBound(sourceLen) as (sourceLen * 16 + 2308) / 8 + 6.

Let us relax the tests for InnoDB ROW_FORMAT=COMPRESSED to cope with
such a weaker compression guarantee.

create_table_info_t::row_size_is_acceptable(): Remove a bogus debug-only
assertion that would fail to hold for the test innodb_zip.bug36169.
The function page_zip_empty_size() may indeed return 0.
2022-02-16 17:03:02 +02:00
Vlad Lesin
5948d7602e MDEV-20605 Awaken transaction can miss inserted by other transaction records due to wrong persistent cursor restoration
Post-push fix: remove unstable test.

The test was developed to find the reason of duplicated rows caused by
MDEV-20605 fix. The test is not necessary as the reason was found and
the bug was fixed.
2022-02-15 10:04:05 +03:00
Vlad Lesin
20e9e804c1 MDEV-20605 Awaken transaction can miss inserted by other transaction records due to wrong persistent cursor restoration
sel_restore_position_for_mysql() moves forward persistent cursor
position after btr_pcur_restore_position() call if cursor relative position
is BTR_PCUR_ON and the cursor points to the record with NOT the same field
values as in a stored record(and some other not important for this case
conditions).

It was done because btr_pcur_restore_position() sets
page_cur_mode_t mode  to PAGE_CUR_LE for cursor->rel_pos ==  BTR_PCUR_ON
before opening cursor. So we are searching for the record less or equal
to stored one. And if the found record is not equal to stored one, then
it is less and we need to move cursor forward.

But there can be a situation when the stored record was purged, but the
new one with the same key but different value was inserted while
row_search_mvcc() was suspended. In this case, when the thread is
awaken, it will invoke sel_restore_position_for_mysql(), which, in turns,
invoke btr_pcur_restore_position(), which will return false because found
record don't match stored record, and
sel_restore_position_for_mysql() will move forward cursor position.

The above can lead to the case when awaken row_search_mvcc() do not see
records inserted by other transactions while it slept. The mtr test case
shows the example how it can be.

The fix is to return special value from persistent cursor restoring
function which would notify its caller that uniq fields of restored
record and stored record are the same, and in this case
sel_restore_position_for_mysql() don't move cursor forward.

Delete-marked records are correctly processed in row_search_mvcc().
Non-unique secondary indexes are "uniquified" by adding the PK, the
index->n_uniq should then be index->n_fields. So there is no need in
additional checks in the fix.

If transaction's readview can't see the changes made in secondary index
record, it requests clustered index record in row_search_mvcc() to check
its transaction id and get the correspondent record version. After this
row_search_mvcc() commits mtr to preserve clustered index latching
order, and starts mtr. Between those mtr commit and start secondary
index pages are unlatched, and purge has the ability to remove stored in
the cursor record, what causes rows duplication in result set for
non-locking reads, as cursor position is restored to the previously
visited record.

To solve this the changes are just switched off for non-locking reads,
it's quite simple solution, besides the changes don't make sense for
non-locking reads.

The more complex and effective from performance perspective solution is
to create mtr savepoint before clustered record requesting and rolling
back to that savepoint after that. See MDEV-27557.

One more solution is to have per-record transaction id for secondary
indexes. See MDEV-17598.

If any of those is implemented, just remove select_lock_type argument in
sel_restore_position_for_mysql().
2022-02-14 17:35:04 +03:00
Marko Mäkelä
52b32c60c2 Merge 10.4 into 10.5 2022-02-14 08:59:33 +02:00
Marko Mäkelä
c9bc10e6e8 Merge 10.3 into 10.4 2022-02-14 08:56:50 +02:00
Marko Mäkelä
e928fdbff1 Merge 10.2 into 10.3 2022-02-14 08:49:11 +02:00
Vlad Lesin
3b10e8f80c MDEV-27746 Wrong comparision of BLOB's empty preffix with non-preffixed BLOB causes rows count mismatch for clustered and secondary indexes during non-locking read
row_sel_sec_rec_is_for_clust_rec() treats empty BLOB prefix field in
secondary index as a field equal to any external BLOB field in clustered
index. Row_sel_get_clust_rec_for_mysql::operator() doesn't zerro out
clustered record pointer in row_search_mvcc(), and row_search_mvcc()
thinks that delete-marked secondary index record has visible for
"CHECK TABLE"'s read view old-versioned clustered index record, and
row_scan_index_for_mysql() counts it as a row.

The fix is to execute row_sel_sec_rec_is_for_blob() in
row_sel_sec_rec_is_for_clust_rec() if clustered field contains BLOB's
reference.
2022-02-11 12:26:27 +03:00
Oleksandr Byelkin
34c5019698 Merge branch '10.5' into bb-10.5-release 2022-02-09 08:57:41 +01:00
Marko Mäkelä
5c46751f23 MDEV-27734 Set innodb_change_buffering=none by default
The aim of the InnoDB change buffer is to avoid delays when a leaf page
of a secondary index is not present in the buffer pool, and a record needs
to be inserted, delete-marked, or purged. Instead of reading the page into
the buffer pool for making such a modification, we may insert a record to
the change buffer (a special index tree in the InnoDB system tablespace).
The buffered changes are guaranteed to be merged if the index page
actually needs to be read later.

The change buffer could be useful when the database is stored on a
rotational medium (hard disk) where random seeks are slower than
sequential reads or writes.

Obviously, the change buffer will cause write amplification, due to
potentially large amount of metadata that is being written to the
change buffer. We will have to write redo log records for modifying
the change buffer tree as well as the user tablespace. Furthermore,
in the user tablespace, we must maintain a change buffer bitmap page
that uses 2 bits for estimating the amount of free space in pages,
and 1 bit to specify whether buffered changes exist. This bitmap needs
to be updated on every operation, which could reduce performance.

Even if the change buffer were free of bugs such as MDEV-24449
(potentially causing the corruption of any page in the system tablespace)
or MDEV-26977 (corruption of secondary indexes due to a currently
unknown reason), it will make diagnosis of other data corruption harder.

Because of all this, it is best to disable the change buffer by default.
2022-02-09 08:36:41 +02:00
Monty
0ec27d7b1f Don't run innodb_defgragment under valgrind (too slow) 2022-02-08 14:32:28 +02:00
Oleksandr Byelkin
cf63eecef4 Merge branch '10.4' into 10.5 2022-02-01 20:33:04 +01:00
Oleksandr Byelkin
a576a1cea5 Merge branch '10.3' into 10.4 2022-01-30 09:46:52 +01:00
Oleksandr Byelkin
41a163ac5c Merge branch '10.2' into 10.3 2022-01-29 15:41:05 +01:00
Marko Mäkelä
e9aac09153 MDEV-25440: Indexed CHAR columns are broken with NO_PAD collations
cmp_data(): Compare different-length CHAR fields with
the new strnncollsp_nchars function that will pad spaces if needed.

Any InnoDB ROW_FORMAT except the original one that was named
ROW_FORMAT=REDUNDANT in MySQL 5.0.3 will internally store
CHAR(n) columns as variable-length if the character encoding is
variable length. Spaces may be trimmed from the end.
For NOT NULL values, the minimum length is always n*mbminlen.
In cmp_data() we only know the lengths in bytes and we cannot
easily know the ROW_FORMAT.

is_strnncoll_compatible(): Refactored from innobase_mysql_cmp().

innobase_mysql_cmp(): Merged to cmp_whole_field().

cmp_whole_field(): Invoke strnncollsp_nchars for the DATA_MYSQL
(the CHAR type with any other collation than latin1_swedish_ci).

Reviewed by: Alexander Barkov
Tested by: Roel Roel Van de Paar
2022-01-26 12:42:17 +02:00
Alexander Barkov
62e320c86d MDEV-18918 SQL mode EMPTY_STRING_IS_NULL breaks RBR upon CREATE TABLE .. SELECT
The 10.5 version of the patch.

Removing DEFAULT from INFORMATION_SCHEMA columns.
DEFAULT in read-only tables is rather meaningless.
Upgrade should go smoothly.

Also fixes:
 MDEV-20254 Problems with EMPTY_STRING_IS_NULL and I_S tables
2022-01-25 10:31:55 +04:00
Alexander Barkov
da37bfd8d6 MDEV-18918 SQL mode EMPTY_STRING_IS_NULL breaks RBR upon CREATE TABLE .. SELECT
Removing DEFAULT from INFORMATION_SCHEMA columns.
DEFAULT in read-only tables is rather meaningless.
Upgrade should go smoothly.

Also fixes:
 MDEV-20254 Problems with EMPTY_STRING_IS_NULL and I_S tables
2022-01-25 10:31:03 +04:00
Eugene Kosov
faaecc8fcf MDEV-27273 Confusing column count in IMPORT TABLESPACE error message
It's misleading to compare and write to user number of columns and fields.
Thus, it would be better to remove that check and let use see a subsequent
error message about missing or mispaced column.

row_import::match_schema(): remove misleading check
2022-01-21 20:25:56 +03:00
Marko Mäkelä
c1d7b4575e MDEV-26870 --skip-symbolic-links does not disallow .isl file creation
The InnoDB DATA DIRECTORY attribute is not implemented via
symbolic links but something similar, *.isl files that contain
the names of data files.

InnoDB failed to ignore the DATA DIRECTORY attribute even though
the server was started with --skip-symbolic-links.

Native ALTER TABLE in InnoDB will retain the DATA DIRECTORY attribute
of the table, no matter if the table will be rebuilt or not.

Generic ALTER TABLE (with ALGORITHM=COPY) as well as TRUNCATE TABLE
will discard the DATA DIRECTORY attribute.

All tests have been run with and without the ./mtr option
--mysqld=--skip-symbolic-links
and some tests that use the InnoDB DATA DIRECTORY attribute
have been adjusted for this.
2022-01-21 14:43:59 +02:00
Thirunarayanan Balathandayuthapani
28e166d643 MDEV-26784 [Warning] InnoDB: Difficult to find free blocks in the buffer pool
Problem:
=======
  InnoDB ran out of memory during recovery and it fails to
flush the dirty LRU blocks. The reason is that buffer pool
can ran out before the LRU list length reaches
BUF_LRU_OLD_MIN_LEN(256) threshold.

Fix:
====
During recovery, InnoDB should write out and evict all
dirty blocks.
2022-01-21 14:15:18 +05:30
Daniel Black
410c4edef3 MDEV-27467: innodb to enforce the minimum innodb_buffer_pool_size in SET GLOBAL
.. to be the same as startup.

In resolving MDEV-27461, BUF_LRU_MIN_LEN (256) is the minimum number of
pages for the innodb buffer pool size. Obviously we need more than just
flushing pages. Taking the 16k page size and its default minimum, an
extra 25% is needed on top of the flushing pages to make a workable buffer
pool.

The minimum innodb_buffer_pool_chunk_size (1M) restricts the minimum
otherwise we'd have a pool made up of different chunk sizes.

The resulting minimum innodb buffer pool sizes are:

Page Size, Previously minimum (startup), with change.
        4k                            5M           2M
        8k                            5M           3M
       16k                            5M           5M
       32k                           24M          10M
       64k                           24M          20M

With this patch, SET GLOBAL innodb_buffer_pool_size minimums are
enforced.

The evident minimum system variable size for innodb_buffer_pool_size
is 2M, however this is only setable if using 4k page size. As
the order of the page_size and buffer_pool_size aren't fixed, we can't
hide this change.

Subsequent changes:
* innodb_buffer_pool_resize_with_chunks.test - raised of pool resize due to new
  minimums. Chunk size also needed increase as the test was for
  pool_size < chunk_size to generate a warning.
* Removed srv_buf_pool_min_size and replaced use with MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* Removed srv_buf_pool_def_size and replaced constant defination in
  MYSQL_SYSVAR_LONGLONG(buffer_pool_size)
* Reordered ha_innodb to allow for direct use of MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* Moved buf_pool_size_align into ha_innodb to access to MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* loose-innodb_disable_resize_buffer_pool_debug is needed in the
  innodb.restart.opt test so that under debug mode, resizing of the
  innodb buffer pool can occur.
2022-01-19 11:10:45 +11:00
Eugene Kosov
e128d852e8 MDEV-27272 Crash on EXPORT/IMPORT tablespace with column added in the middle
dict_index_t::reconstruct_fields(): add input validation by replacing some
assertions

handle_instant_metadata(): fix nullptr dereference
2022-01-19 00:04:57 +03:00
Vlad Lesin
be8113861c MDEV-27025 insert-intention lock conflicts with waiting ORDINARY lock
The code was backported from 10.6 bd03c0e516
commit. See that commit message for details.

Apart from the above commit trx_lock_t::wait_trx was also backported from
MDEV-24738. trx_lock_t::wait_trx is protected with lock_sys.wait_mutex
in 10.6, but that mutex was implemented only in MDEV-24789. As there is no
need to backport MDEV-24789 for MDEV-27025,
trx_lock_t::wait_trx is protected with the same mutexes as
trx_lock_t::wait_lock.

This fix should not break innodb-lock-schedule-algorithm=VATS. This
algorithm uses an Eldest-Transaction-First (ETF) heuristic, which prefers
older transactions over new ones. In this fix we just insert granted lock
just before the last granted lock of the same transaction, what does not
change transactions execution order.

The changes in lock_rec_create_low() should not break Galera Cluster,
there is a big "if" branch for WSREP. This branch is necessary to provide
the correct transactions execution order, and should not be changed for
the current bug fix.
2022-01-18 18:15:10 +03:00
Marko Mäkelä
d7f4fd30f2 MDEV-8851 innodb.innodb_information_schema fails sporadically
The column INFORMATION_SCHEMA.INNODB_LOCKS.LOCK_DATA
would report NULL when the page that contains the locked
record does not reside in the buffer pool.

Pages may be evicted from the buffer pool due to some background
activity, such as the purge of transaction history loading
undo log pages to the buffer pool. The regression tests intentionally
run with a small buffer pool size setting.

To prevent the intermittent test failures, we will filter out the
contents of the LOCK_DATA column from the output.
2022-01-14 15:53:29 +02:00
Aleksey Midenkov
6831b3f2a0 MDEV-26824 Can't add foreign key with empty referenced columns list
create_table_info_t::create_foreign_keys() expects equal number of
iterations through fk->columns and fk->ref_columns. If fk->ref_columns
is empty copy it from fk->columns.
2022-01-12 17:18:38 +03:00
Julius Goryavsky
55bb933a88 Merge branch 10.4 into 10.5 2021-12-26 12:51:04 +01:00
Julius Goryavsky
681b7784b6 Merge branch 10.3 into 10.4 2021-12-25 12:13:03 +01:00
Julius Goryavsky
3376668ca8 Merge branch 10.2 into 10.3 2021-12-23 14:14:04 +01:00
Marko Mäkelä
3b33593f80 MDEV-27332 SIGSEGV in fetch_data_into_cache()
Since commit fb335b48b5 we may have
a null pointer in purge_sys.query when fetch_data_into_cache() is
invoked and innodb_force_recovery>4. This is because the call to
purge_sys.create() would be skipped.

fetch_data_into_cache(): Load the purge_sys pseudo transaction pointer
to a local variable (null pointer if purge_sys is not initialized).
2021-12-21 11:07:25 +02:00
Marko Mäkelä
ef9517eb81 MDEV-27268 Failed InnoDB initialization leaves garbage files behind
create_log_files(): Check log_set_capacity() before modifying
or creating any log files.

innobase_start_or_create_for_mysql(): If create_log_files()
fails and we were initializing a new database, delete the
system tablespace files before exiting.
2021-12-15 14:17:55 +02:00
Marko Mäkelä
6b066ec332 MDEV-27235: Crash on SET GLOBAL innodb_encrypt_tables
fil_crypt_set_encrypt_tables(): If no encryption threads have been
initialized, do nothing.
2021-12-13 08:04:45 +02:00
Marko Mäkelä
5489ce0ae1 Merge 10.4 into 10.5 2021-11-17 14:49:12 +02:00
Marko Mäkelä
878f7e38c7 MDEV-23805 fixup: Adjsut the MDEV-16131 and MDEV-24730 tests
MDEV-23805 simplified the treatment of empty tables during ALTER TABLE,
which could prevent the scenarios that were previously reported and
fixed as MDEV-16131 and MDEV-24730.

With the MDEV-23805 fix, the statement
SET DEBUG_SYNC = 'now WAIT_FOR copied';
could occasionally time out, depending on timing.
Apparently, there was a race condition where purge could resume
(and empty the table) before ALTER TABLE got the chance to execute.
We must prevent the purge of history from running before
ALTER TABLE has started executing.
2021-11-17 13:40:21 +02:00
Marko Mäkelä
09205a1c9a Merge 10.4 into 10.5 2021-11-16 14:26:13 +02:00
Thirunarayanan Balathandayuthapani
d270525dfd MDEV-23805 Make Online DDL to Instant DDL when table is empty
- In ha_innobase::prepare_inplace_alter_table(), InnoDB should
check whether the table is empty. If the table is empty then
server should avoid downgrading the MDL after prepare phase.
It is more like instant alter, does change only in dicationary
and metadata.

- Changed few debug test case to make non-empty DDL table
2021-11-12 17:46:35 +05:30
Marko Mäkelä
9c18b96603 Merge 10.4 into 10.5 2021-11-09 08:50:33 +02:00
Marko Mäkelä
47ab793d71 Merge 10.3 into 10.4 2021-11-09 08:40:14 +02:00
Marko Mäkelä
524b4a89da Merge 10.2 into 10.3 2021-11-09 08:26:59 +02:00
Alexander Barkov
059797ed44 MDEV-24901 SIGSEGV in fts_get_table_name, SIGSEGV in ib_vector_size, SIGSEGV in row_merge_fts_doc_tokenize, stack smashing
strmake() puts one extra 0x00 byte at the end of the string.
The code in my_strnxfrm_tis620[_nopad] did not take this into
account, so in the reported scenario the 0x00 byte was put outside
of a stack variable, which made ASAN crash.

This problem is already fixed in in MySQL:

  commit 19bd66fe43c41f0bde5f36bc6b455a46693069fb
  Author: bin.x.su@oracle.com <>
  Date:   Fri Apr 4 11:35:27 2014 +0800

But the fix does not seem to be correct, as it breaks when finds a zero byte
in the source string.

Using memcpy() instead of strmake().

- Unlike strmake(), memcpy() it does not write beyond the destination
  size passed.
- Unlike the MySQL fix, memcpy() does not break on the first 0x00 byte found
  in the source string.
2021-10-29 12:37:29 +04:00
Marko Mäkelä
44f9736e0b Merge 10.4 into 10.5 2021-10-27 09:48:22 +03:00
Eugene Kosov
d74d95961a MDEV-18543 IMPORT TABLESPACE fails after instant DROP COLUMN
ALTER TABLE IMPORT doesn't properly handle instant alter metadata.
This patch makes IMPORT read, parse and apply instant alter metadata at the
very beginning of operation. So, cases when source table has some metadata
and destination table doesn't have it now works fine.

DISCARD already removes instant metadata so importing normal table into
instant table worked fine before this patch.

decrypt_decompress(): decrypts and decompresses page if needed

handle_instant_metadata(): this should be the first thing to read source
table. Basically, it applies instant metadata to a destination
dict_table_t object. This is the first thing to read FSP flags so
all possible checks of it were moved to this function.

PageConverter::update_index_page(): it doesn't now read instant metadata.
This logic were moved into handle_instant_metadata()

row_import::match_flags(): this is a first part row_import::match_schema().
As a separate function it's used by handle_instant_metadata().

fil_space_t::is_full_crc32_compressed(): added convenient function

ha_innobase::discard_or_import_tablespace(): do not reload table definition
to read instant metadata because handle_instant_metadata() does it better.
The reverted code was originally added in
4e7ee166a9

ANONYMOUS_VAR: this is a handy thing to use along with make_scope_exit()

full_crc32_import.test shows different results, because no
dict_table_close() and dict_table_open_on_id() happens.
Thus, SHOW CREATE TABLE shows a little bit older table definition.
2021-10-26 22:50:58 +06:00
Marko Mäkelä
5f8561a6bc Merge 10.4 into 10.5 2021-10-21 15:26:25 +03:00
Marko Mäkelä
489ef007be Merge 10.3 into 10.4 2021-10-21 14:57:00 +03:00
Marko Mäkelä
e4a7c15dd6 Merge 10.2 into 10.3 2021-10-21 13:41:04 +03:00
Marko Mäkelä
05c3dced86 MDEV-22627 fixup: Cover also ALTER TABLE...ALGORITHM=INPLACE 2021-10-20 22:16:23 +03:00