Commit graph

23056 commits

Author SHA1 Message Date
Marko Mäkelä
4e1bf2bb23 MDEV-28537 Unused or useless InnoDB counters num_index_pages_written, num_non_index_pages_written
The counters were added in commit 5e55d1ced5
and any code to update them was
inadvertently removed in commit 2e814d4702
when applying InnoDB changes from MySQL 5.7.

Let us remove these counters that never reported anything useful. If such
statistics are really needed in a special case, they can be obtained by
instrumenting the code by some means, such as eBPF or a source code patch.
2022-05-16 13:41:53 +03:00
Nayuta Yanagisawa
8c28b27f00 MDEV-28301 Spider: Fix GCC warnings, comparing the result of pointer addition ... and NULL
The condition of the if statements are always true.
2022-05-13 21:32:49 +09:00
Sergei Golubchik
6f741eb6e4 Merge branch '10.2' into 10.3 2022-05-07 11:48:15 +02:00
Marko Mäkelä
20ae4816bb MDEV-28478: INSERT into SPATIAL INDEX in TEMPORARY table writes log
row_ins_sec_index_entry_low(): If a separate mini-transaction is
needed to adjust the minimum bounding rectangle (MBR) in the parent
page, we must disable redo logging if the table is a temporary table.
For temporary tables, no log is supposed to be written, because
the temporary tablespace will be reinitialized on server restart.

rtr_update_mbr_field(): Plug a memory leak.
2022-05-06 09:30:17 +03:00
Vlad Lesin
2c381d8cf6 MDEV-17843 Assertion `page_rec_is_leaf(rec)' failed in lock_rec_queue_validate upon SHOW ENGINE INNODB STATUS
lock_validate() accumulates page ids under locked lock_sys->mutex, then
releases the latch, and invokes lock_rec_block_validate() for each page.
Some other thread has ability to add/remove locks and change pages
between releasing the latch in lock_validate() and acquiring it in
lock_rec_validate_page().

lock_rec_validate_page() can invoke lock_rec_queue_validate() for
non-locked supremum, what can cause ut_ad(page_rec_is_leaf(rec)) failure
in lock_rec_queue_validate().

The fix is to invoke lock_rec_queue_validate() only for locked records
in lock_rec_validate_page().

The error message in lock_rec_block_validate() is not necessary as
BUF_GET_POSSIBLY_FREED mode is used to get block from buffer pool, and
this is not error if a block was evicted.

The test case would require new debug sync point. I think it's not
necessary as the fixed code is debug-only.
2022-05-04 12:51:28 +03:00
Oleksandr Byelkin
9614fde1aa Merge branch '10.2' into 10.3 2022-05-03 10:59:54 +02:00
Marko Mäkelä
0806592ac8 MDEV-28422 Page split breaks a gap lock
btr_insert_into_right_sibling(): Inherit any gap lock from the
left sibling to the right sibling before inserting the record
to the right sibling and updating the node pointer(s).

lock_update_node_pointer(): Update locks in case a node pointer
will move.

Based on mysql/mysql-server@c7d93c274f
2022-04-27 13:38:08 +03:00
Marko Mäkelä
c711abd182 MDEV-28417 Merge new release of InnoDB 5.7.38 to 10.2 2022-04-27 08:08:52 +03:00
Marko Mäkelä
44a27a26e9 MDEV-28416 Incorrect AUTO_INCREMENT may be issued when close to UINT64_MAX
ha_innobase::get_auto_increment(): In the overflow check, account
for 64-bit unsigned integer wrap-around.

Based on mysql/mysql-server@25ecfe7f49
2022-04-27 08:08:06 +03:00
Marko Mäkelä
f21a875600 MDEV-28415 ALTER TABLE on a large table hangs InnoDB
buf_flush_page(): Never wait for a page latch, even in checkpoint
flushing (flush_type == BUF_FLUSH_LIST), to prevent a hang of the
page cleaner threads when a large number of pages is latched.

In mysql/mysql-server@9542f3015b
it was claimed that such a hang only affects CREATE FULLTEXT INDEX.
Their fix was to retain buffer-fix but release exclusive latch
on non-leaf pages, and subsequently write to those pages while
they are not associated with the mini-transaction, which would
trip a debug assertion in the MariaDB version of
mtr_t::memo_modify_page() and cause potential corruption
when using the default MariaDB setting innodb_log_optimize_ddl=OFF.

This change essentially backports a small part of
commit 7cffb5f6e8 (MDEV-23399)
from MariaDB Server 10.5.7.
2022-04-27 07:57:04 +03:00
Marko Mäkelä
b208030ef5 MDEV-11415 merge fixup: Remove a redundant call
In merge commit 921c5e9314 the call
log_free_check() was accidentally duplicated, causing a small
performance regression on INSERT.
2022-04-25 14:14:02 +03:00
Dmitry Shulga
bc7ba7afee MDEV-27758: Errors when building Connect engine on os x 11.6.2
Added checking for support of vfork by a platform where
building being done. Set HAVE_VFORK macros in case vfork()
system call is supported. Use vfork() system call if the
macros HAVE_VFORK is set, else use fork().
2022-04-22 18:47:19 +07:00
Sergei Golubchik
6f6c74b0d1 Merge branch '10.2' into 10.3 2022-04-21 10:05:50 +02:00
Marko Mäkelä
4730314a70 MDEV-28369 ibuf_bitmap_mutex is an unnecessary contention point
The only purpose of ibuf_bitmap_mutex is to prevent a deadlock between
two concurrent invocations of ibuf_update_free_bits_for_two_pages_low()
on the same pair of bitmap pages, but in opposite order.
The mutex is unnecessarily serializing the execution of the function
even when it is being invoked on totally different tablespaces.
To avoid deadlocks, it suffices to ensure that the two page latches
are being acquired in a deterministic (sorted) order.
2022-04-21 09:15:18 +03:00
Thirunarayanan Balathandayuthapani
372b0e6355 MDEV-20194 Warnings inconsistently issued upon CHECK on table from older versions
The following condition has to added:
1) InnoDB fails to include the offset of the node pointer field
in non-leaf record for redundant row format.

2) If the Fixed length field does have only prefix length then
calculate the field maximum size as prefix length.

- Added the test case to test (2) and to check maximum number of
fields can exist in the index.
2022-04-20 19:55:17 +05:30
Jan Lindström
84b135065c MDEV-28314 : The Galera cluster primary node goes into hang mode when innodb_encryption_threads is enabled
When we enable writes after Galera SST srv_n_fil_crypt_threads needs
to be set temporally to 0 (as was done when writes were disabled)
to make sure that encryption threads will be really started based
on old value of encryption threads.

Fix provided by Marko Mäkelä.
2022-04-20 07:42:21 +03:00
Marko Mäkelä
5aef0123a7 MDEV-28317 Assertion failures in row_undo_mod on recovery
Starting with 10.3, an assertion would fail on the rollback of
a recovered incomplete transaction if a table definition violates
a FOREIGN KEY constraint.

DICT_ERR_IGNORE_RECOVER_LOCK: Include also DICT_ERR_IGNORE_FK_NOKEY
so that trx_resurrect_table_locks() will be able to load
table definitions and resurrect IX locks. Previously, if the
FOREIGN KEY constraints of a table were incomplete, the table
would fail to load until rollback, and in 10.3 or later an assertion
would fail that the rollback was not protected by a table IX lock.

Thanks to commit 9de2e60d74 there
will be no problems to enforce subsequent FOREIGN KEY operations
even though a table with invalid REFERENCES clause was loaded.
2022-04-19 12:40:05 +03:00
Alexander Barkov
9d734cdd61 Merge remote-tracking branch 'origin/10.2' into 10.3 2022-04-14 11:50:34 +04:00
Monty
6891c4874a MDEV-28269 Assertion `save_errno' in maria_write or ER_GET_ERRNO
The issue was that the value of MARIA_FOUND_WRONG_KEY was a value
that could be returned by ha_key_cmp.

This was already fixed in MyISAM, now using the same fix in Aria:
Setting the value to INT_MAX32, which should be impossible in any
normal cases.

I also fixed so that if there is a wrong key, we now get a proper error
message and not an assert.
2022-04-11 17:30:28 +03:00
KiyoshiTakeda
4d1955d348
MDEV-28225 Disallow user to create Spider temporary table
Creating a temporary table with Spider is non-sense because a Spider
table cannot hold any physical data and it requires an additional
effort to manage even if it is configured correctly.

Set HTON_TEMPORARY_NOT_SUPPORTED to spider_hton->flags.  

Reviewed-by: nayuta.yanagisawa@hey.com
Co-authored-by: d8sk4ueun@gmail.com
2022-04-11 23:02:38 +09:00
Nayuta Yanagisawa
4194f7b605 MDEV-25116 Spider: IF(COUNT( trigger SQL Error (1054)_ Unknown column '' in field list
The original query "SELECT IF(COUNT(a.`id`)>=0,'Y','N') FROM t" is
transformed to "SELECT COUNT(a.`id`), IF(ref >= 0, 'Y', 'N') FROM t",
where ref is Item_ref to "COUNT(a.`id`)", by split_sum_func().

Spider walks the item list twice, invoking spider_db_print_item_type().
The first invocation is in spider_create_group_by_handler() with
str == NULL. The second one is in spider_group_by_handler::init_scan()
with str != NULL.

spider_db_print_item_type() prints nothing at the first invocation,
and it prints item at the second invocation. However, at the second
invocation, the above mentioned ref to "COUNT(a.`id`)" points to
a field in a temporary table where the result will be stored. Thus,
to look behind the item_ref, Spider need to generate the query earlier.

A possible fix would be to generate a query to send in
spider_create_group_by_handler(). However, the fix requires a
considerable amount of changes of the Spider's GROUP BY handler.
I'd like to avoid that.

So, I fix the problem by not to use the GROUP BY handler when a
query contains Item_ref whose table_name, name, and alias_name_used
are not set.
2022-04-08 15:27:33 +09:00
Jan Lindström
3c99a48db3 MDEV-28247 : Disable background ibuf merge during Galera SST
This failure was caused by MDEV-25975, which removed the parameter
innodb_disallow_writes.

Added a check for wsrep_sst_disable_writes to the function
ibuf_merge_in_background().
2022-04-07 08:45:01 +03:00
Marko Mäkelä
e9735a8185 MDEV-25975 innodb_disallow_writes causes shutdown to hang
We will remove the parameter innodb_disallow_writes because it is badly
designed and implemented. The parameter was never allowed at startup.
It was only internally used by Galera snapshot transfer.
If a user executed
SET GLOBAL innodb_disallow_writes=ON;
the server could hang even on subsequent read operations.

During Galera snapshot transfer, we will block writes
to implement an rsync friendly snapshot, as follows:

sst_flush_tables() will acquire a global lock by executing
FLUSH TABLES WITH READ LOCK, which will block any writes
at the high level.

sst_disable_innodb_writes(), invoked via ha_disable_internal_writes(true),
will suspend or disable InnoDB background tasks or threads that could
initiate writes. As part of this, log_make_checkpoint() will be invoked
to ensure that anything in the InnoDB buf_pool.flush_list will be written
to the data files. This has the nice side effect that the Galera joiner
will avoid crash recovery.

The changes to sql/wsrep.cc and to the tests are based on a prototype
that was developed by Jan Lindström.

Reviewed by: Jan Lindström
2022-04-06 08:06:49 +03:00
Marko Mäkelä
7c584d8270 Merge 10.2 into 10.3 2022-04-06 08:06:35 +03:00
Vlad Lesin
c1ab0e6fc6 MDEV-27343 Useless warning "InnoDB: Allocated tablespace ID <id> for <tablename>, old maximum was 0" during backup stage
mariabackup does not load dictionary during backup, but it loads
tablespaces, that is why fil_system.max_assigned_id is not set with
dict_check_tablespaces_and_store_max_id(). There is no sense to issue the
warning during backup.
2022-03-30 19:42:35 +03:00
Marko Mäkelä
35425cfc55 Cleanup: Remove some unused functions 2022-03-30 15:57:08 +03:00
Marko Mäkelä
020e7d89eb Merge 10.2 into 10.3 2022-03-29 09:53:15 +03:00
Marko Mäkelä
303448bc91 MDEV-27931: buf_page_is_corrupted() wrongly claims corruption
In commit 437da7bc54 (MDEV-19534),
the default value of the global variable srv_checksum_algorithm
in innochecksum was changed from SRV_CHECKSUM_ALGORITHM_INNODB
to implied 0 (innodb_checksum_algorithm=crc32). As a result,
the function buf_page_is_corrupted() would by default invoke
buf_calc_page_crc32() in innochecksum, and crc32_inited would hold.

This would cause "innochecksum" to fail on a particular page.

The actual problem is older, introduced in 2011 in
mysql/mysql-server@17e497bdb7
(MySQL 5.6.3). It should affect the validation of pages of old
data files that were written with innodb_checksum_algorithm=innodb.
When using innodb_checksum_algorithm=crc32 (the default setting
since MariaDB Server 10.2), some valid pages would be rejected
only because exactly one of the two checksum fields accidentally
matches the innodb_checksum_algorithm=crc32 value.

buf_page_is_corrupted(): Simplify the logic of non-strict
checksum validation, by always invoking buf_calc_page_crc32().
Remove a bogus condition that if only one of the checksum fields
contains the value returned by buf_calc_page_crc32(), the page
is corrupted.
2022-03-28 13:36:36 +03:00
Monty
74e668eaeb Fixed warning for maria.maria-recovery2 about crashed table
The bug was a missing va_start in eprint() which caused a wrong table
name to be printed.
Patch backported from 10.3.
2022-03-18 13:26:50 +02:00
Marko Mäkelä
118826d173 Fix gcc-12 -O2 -Warray-bounds 2022-03-17 10:20:07 +02:00
Marko Mäkelä
75e39f3cba Fix gcc-12 -O2 -Wmaybe-uninitialized 2022-03-17 10:13:50 +02:00
Marko Mäkelä
0f56e21efa MDEV-28091 PERFORMANCE_SCHEMA unit tests fail due to memory misalignment
Let us make the mocked-up pfs_malloc() return aligned memory, just
like the actual implementation does.
2022-03-16 11:49:47 +02:00
Vlad Lesin
1766a18e06 MDEV-19577 Replication does not work with innodb_autoinc_lock_mode=2
The first step for deprecating innodb_autoinc_lock_mode(see MDEV-27844) is:
- to switch statement binlog format to ROW if binlog format is MIXED and
the statement changes autoincremented fields
- issue warnings if innodb_autoinc_lock_mode == 2 and binlog format is
STATEMENT
2022-03-10 15:38:43 +03:00
Vlad Lesin
86c1bf118a MDEV-27992 DELETE fails to delete record after blocking is released
MDEV-27025 allows to insert records before the record on which DELETE is
locked, as a result the DELETE misses those records, what causes serious ACID
violation.

Revert MDEV-27025, MDEV-27550. The test which shows the scenario of ACID
violation is added.
2022-03-07 16:42:05 +03:00
Marko Mäkelä
02da00a98c Merge 10.2 into 10.3 2022-03-04 14:29:36 +02:00
Marko Mäkelä
3c06a0b7dc MDEV-28004 ha_innobase::reset_auto_increment() is never executed
The virtual member function handler::reset_auto_increment(ulonglong)
is only ever invoked by the default implementation of the virtual
member function handler::truncate().

Because ha_innobase::truncate() overrides handler::truncate() without
ever invoking handler::truncate(), some InnoDB member functions are
never called.

ha_innobase::innobase_reset_autoinc(), ha_innobase::reset_auto_increment():
Removed (unreachable code).

ha_innobase::delete_all_rows(): Removed. The default implementation
handler::delete_all_rows() works just as fine.
2022-03-04 14:23:33 +02:00
Thirunarayanan Balathandayuthapani
1248fe7277 MDEV-27582 Fulltext DDL decrements the FTS_DOC_ID value
- InnoDB FTS DDL decrements the FTS_DOC_ID when there is a
deleted marked record involved. FTS_DOC_ID must never be
reused. The purpose of FTS_DOC_ID is to be a unique row
identifier that will be changed whenever a fulltext indexed
column is updated.
2022-03-03 19:03:31 +05:30
Marko Mäkelä
a92f07f4bd MDEV-27993 Assertion failed in btr_page_reorganize_low()
btr_cur_optimistic_insert(): Disregard DEBUG_DBUG injection to
invoke btr_page_reorganize() if the page (and the table) is empty.
Otherwise, an assertion would fail in btr_page_reorganize_low()
because PAGE_MAX_TRX_ID is 0 in an empty secondary index leaf page.
2022-03-03 11:51:25 +02:00
Marko Mäkelä
0635088deb MDEV-27800: Avoid garbage TRX_UNDO_TRX_NO on TRX_UNDO_CACHED pages
In commit c7d0448797 (MDEV-15132)
MariaDB Server 10.3 stopped writing the latest transaction identifier
to the TRX_SYS page. Instead, the transaction identifier will be
recovered from undo log pages.

Unfortunately, before commit 3926673ce7
and mysql/mysql-server@dc29792ff2
(MySQL 5.1.48 or MariaDB 5.1.48) InnoDB did not always initialize all
data fields, but some garbage could be left behind in unused parts
of data pages.

In undo log pages that are essentially free, but added to a list for
reuse (TRX_UNDO_CACHED) the TRX_UNDO_TRX_NO fields could contain garbage,
instead of 0. As long as such undo pages are being reused and never
marked completely free, the garbage contents may remain forever.
In fact, the function trx_undo_header_create() and the record
MLOG_UNDO_HDR_CREATE will only initialize TRX_UNDO_TRX_ID, but leave
TRX_UNDO_TRX_NO uninitialized.

trx_undo_mem_create_at_db_start(): Only read the TRX_UNDO_TRX_NO
fields of TRX_UNDO_CACHED pages if the TRX_UNDO_PAGE_TYPE is 0,
that is, the page was updated by MariaDB Server 10.3. Earlier versions
would always write the TRX_UNDO_PAGE_TYPE as 1 or 2.

trx_undo_header_create(): Zero out the TRX_UNDO_TRX_NO field.
Strictly speaking, this will change the semantics of the
MLOG_UNDO_HDR_CREATE record, but it should not do any harm to
overwrite a potentially garbage field with zeroes.

Note: This fix will only help future upgrades straight from
MariaDB Server 10.2 or MySQL 5.6 or earlier. If such an upgrade has
already been made, then an earlier server startup could have
fast-forwarded the transaction ID sequence to a large value.
If this large value cannot be represented in 48 bits (the size of
the DB_TRX_ID column in clustered index records), then various
strange things can happen.
2022-02-28 12:12:12 +02:00
Marko Mäkelä
9ba385a50d Merge 10.2 into 10.3 2022-02-25 12:40:26 +02:00
Marko Mäkelä
ed691eca99 Remove deprecated (in C++11) std::binary_function 2022-02-25 12:34:06 +02:00
Marko Mäkelä
00b70bbb51 Merge 10.2 into 10.3 2022-02-25 10:43:38 +02:00
Thirunarayanan Balathandayuthapani
a76731e1a1 MDEV-27913 innodb_ft_cache_size max possible value (80000000) is too small for practical purposes
- Make innodb_ft_cache_size & innodb_ft_total_cache_size are dynamic
variable and increase the maximum value of innodb_ft_cache_size to
512MB for 32-bit system and 1 TB for 64-bit system and set
innodb_ft_total_cache_size maximum value to 1 TB for 64-bit system.

- Print warning if the fts cache exceeds the innodb_ft_cache_size
and also unlock the cache if fts cache memory reduces less than
innodb_ft_cache_size.
2022-02-24 22:41:23 +05:30
Marko Mäkelä
8b7abe21e0 MDEV-23888 fixup: GCC 12 -Wunused-value 2022-02-23 06:00:01 +02:00
Vlad Lesin
a6f258e47f MDEV-20605 Awaken transaction can miss inserted by other transaction records due to wrong persistent cursor restoration
Backported from 10.5 20e9e804c1 and
5948d7602e.

sel_restore_position_for_mysql() moves forward persistent cursor
position after btr_pcur_restore_position() call if cursor relative position
is BTR_PCUR_ON and the cursor points to the record with NOT the same field
values as in a stored record(and some other not important for this case
conditions).

It was done because btr_pcur_restore_position() sets
page_cur_mode_t mode  to PAGE_CUR_LE for cursor->rel_pos ==  BTR_PCUR_ON
before opening cursor. So we are searching for the record less or equal
to stored one. And if the found record is not equal to stored one, then
it is less and we need to move cursor forward.

But there can be a situation when the stored record was purged, but the
new one with the same key but different value was inserted while
row_search_mvcc() was suspended. In this case, when the thread is
awaken, it will invoke sel_restore_position_for_mysql(), which, in turns,
invoke btr_pcur_restore_position(), which will return false because found
record don't match stored record, and
sel_restore_position_for_mysql() will move forward cursor position.

The above can lead to the case when awaken row_search_mvcc() do not see
records inserted by other transactions while it slept. The mtr test case
shows the example how it can be.

The fix is to return special value from persistent cursor restoring
function which would notify its caller that uniq fields of restored
record and stored record are the same, and in this case
sel_restore_position_for_mysql() don't move cursor forward.

Delete-marked records are correctly processed in row_search_mvcc().
Non-unique secondary indexes are "uniquified" by adding the PK, the
index->n_uniq should then be index->n_fields. So there is no need in
additional checks in the fix.

If transaction's readview can't see the changes made in secondary index
record, it requests clustered index record in row_search_mvcc() to check
its transaction id and get the correspondent record version. After this
row_search_mvcc() commits mtr to preserve clustered index latching
order, and starts mtr. Between those mtr commit and start secondary
index pages are unlatched, and purge has the ability to remove stored in
the cursor record, what causes rows duplication in result set for
non-locking reads, as cursor position is restored to the previously
visited record.

To solve this the changes are just switched off for non-locking reads,
it's quite simple solution, besides the changes don't make sense for
non-locking reads.

The more complex and effective from performance perspective solution is
to create mtr savepoint before clustered record requesting and rolling
back to that savepoint after that. See MDEV-27557.

One more solution is to have per-record transaction id for secondary
indexes. See MDEV-17598.

If any of those is implemented, just remove select_lock_type argument in
sel_restore_position_for_mysql().
2022-02-21 12:49:54 +03:00
Vlad Lesin
5f001bd7b8 MDEV-27025 insert-intention lock conflicts with waiting ORDINARY lock
The code was backported from 10.5 be8113861c
commit. See that commit message for details.
2022-02-21 12:49:54 +03:00
Vladislav Vaintroub
24ec144c63 MDEV-27901 Windows : expensive system calls used to calculate file system block size
The result is not used anywhere but in the output of Innodb information
schema, but this can take as much as 7%CPU (only) on a benchmark.

Fix to move fs blocksize calculate to where it is used.
2022-02-20 22:00:42 +01:00
Nayuta Yanagisawa
66f55a018b MDEV-27730 Add PLUGIN_VAR_DEPRECATED flag to plugin variables
The sys_var class has the deprecation_substitute member to mark the
deprecated variables. As it's set, the server produces warnings when
these variables are used. However, the plugin has no means to utilize
that functionality.

So, the PLUGIN_VAR_DEPRECATED flag is introduced to set the
deprecation_substitute with the empty string. A non-empty string can
make the warning more informative, but there's no nice way seen to
specify it, and not that needed at the moment.
2022-02-18 13:10:20 +09:00
Marko Mäkelä
5b237e5965 Merge 10.2 into 10.3 2022-02-17 10:53:58 +02:00
Marko Mäkelä
73c391afc5 MDEV-27583 InnoDB uses different constants for FK cascade error message in SQL vs error log
convert_error_code_to_mysql(): Use the correct limit FK_MAX_CASCADE_DEL
in the error message. The DICT_FK_MAX_RECURSIVE_LOAD applies to
the number of foreign key constraints in table definitions,
not to the number of rows that are visited while processing
a foreign key constraint.
2022-02-17 10:48:24 +02:00