Tests with 4096-byte sector size confirm that it is
safe to use O_DIRECT with page_compressed tables.
That had been disabled on Linux, in an attempt to fix MDEV-21584
which had been filed for the O_DIRECT problems earlier.
The fil_node_t::block_size was being set mostly correctly until
commit 10dd290b4b (MDEV-17380)
introduced a regression in MariaDB Server 10.4.4.
fil_node_open_file(): Only avoid setting O_DIRECT on
ROW_FORMAT=COMPRESSED tables that use KEY_BLOCK_SIZE=1 or 2
(1024 or 2048 bytes).
fil_ibd_create(): Avoid setting O_DIRECT on ROW_FORMAT=COMPRESSED tables
that use KEY_BLOCK_SIZE=1 or 2 (1024 or 2048 bytes).
fil_node_t::find_metadata(): Require fstat() to be always invoked
outside Microsoft Windows, so that fil_node_t::block_size can be set.
fil_node_t::read_page0(): Rely on find_metadata() to assign block_size.
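
The O_DIRECT rule above can be sketched as follows. This is a minimal
illustration, not the InnoDB code: the helper may_use_o_direct() and the
constant kMaxSectorSize are hypothetical, and the sketch assumes that
4096 bytes is the largest sector size of interest, so that only
ROW_FORMAT=COMPRESSED pages of 1024 or 2048 bytes are excluded.

```cpp
#include <cstddef>

// Assumption for this sketch: no supported device has sectors larger
// than 4096 bytes.
constexpr std::size_t kMaxSectorSize = 4096;

// Hypothetical helper: may a data file with the given physical page size
// (in bytes) be opened with O_DIRECT?
// Pages smaller than a sector (KEY_BLOCK_SIZE=1 or 2) cannot be read or
// written as whole aligned sectors, so they are excluded.
constexpr bool may_use_o_direct(std::size_t physical_page_size)
{
  return physical_page_size >= kMaxSectorSize;
}
```

With this rule, KEY_BLOCK_SIZE=4 and larger pages, as well as all
uncompressed and page_compressed page sizes of at least 4096 bytes,
remain eligible for O_DIRECT.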
Thanks to Vladislav Vaintroub for testing this on Microsoft Windows
using an old-fashioned rotational hard disk with 4KiB sector size.
Reviewed by: Vladislav Vaintroub
This is a port of commit 00f620b27e
and commit 6505662c23 from 10.2.
fil_op_replay_rename(): Remove.
fil_rename_tablespace_check(): Remove the parameter is_discarded=false.
recv_sys_t::parse(): Instead of applying FILE_RENAME operations,
buffer the operations in renamed_spaces.
recv_sys_t::apply(): In the last_batch, apply renamed_spaces.
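
The idea of deferring renames can be sketched like this. Only the name
renamed_spaces comes from the description above; the struct
recovery_sketch, its members, and the choice of letting a later rename
of the same tablespace overwrite an earlier one are assumptions made for
illustration.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the recovery state.
struct recovery_sketch
{
  // Latest name seen for each tablespace ID while parsing the log.
  std::map<std::uint32_t, std::string> renamed_spaces;
  // Renames that were actually carried out (for demonstration only).
  std::vector<std::pair<std::uint32_t, std::string>> applied;

  // Parsing only records the rename; a later record for the same
  // tablespace replaces the earlier one.
  void parse_file_rename(std::uint32_t space_id, const std::string &new_name)
  {
    renamed_spaces[space_id] = new_name;
  }

  // Only the last batch performs the buffered renames.
  void apply(bool last_batch)
  {
    if (!last_batch)
      return;
    for (const auto &r : renamed_spaces)
      applied.emplace_back(r.first, r.second);
    renamed_spaces.clear();
  }
};
```

In this sketch, earlier batches leave the file names alone, so every
rename happens once, after all log records have been parsed.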
Online logging of an insert into a ROW_FORMAT=REDUNDANT table fails with
an index->is_instant() assertion. Purge can reset n_core_fields while
ALTER TABLE is waiting to upgrade the MDL for the commit phase of the
DDL. In the meantime, any INSERT that tries to log the operation fails
because the index is no longer instant.
row_log_get_n_core_fields(): Get the n_core_fields of online log
for the given index.
rec_get_converted_size_comp_prefix_low(): Use n_core_fields of online
log when InnoDB calculates the size of data tuple during redundant
row format table rebuild.
rec_convert_dtuple_to_rec_comp(): Use n_core_fields of online log
when InnoDB does the conversion of data tuple to record during
redundant row format table rebuild.
- Adding the test case which has more than 129 instant columns.
- This is caused by merge commit a26e7a3726.
InnoDB fails to fetch the next index field when there is an externally
stored column length check involved.
In commit 118e258aaa (part of MDEV-23855)
we inadvertently broke crash recovery, reintroducing MDEV-11556.
fil_system_t::extend_to_recv_size(): Extend all open tablespace files
to the recovered size.
recv_sys_t::apply(): Invoke fil_system.extend_to_recv_size() at the
start of each batch. In this way, any fil_space_t::recv_size
changes that were parsed after the file was opened will be applied.
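
A minimal sketch of the per-batch extension step, under stated
assumptions: space_sketch and its fields are hypothetical, sizes are
counted in pages, and "extending" is modelled as raising the in-memory
size rather than growing a real file.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a tablespace file.
struct space_sketch
{
  bool is_open;            // whether the file has been opened
  std::uint32_t size;      // current size in pages
  std::uint32_t recv_size; // size in pages recovered from the log, or 0
};

// Extend every open file whose recovered size exceeds its current size.
// Calling this at the start of each batch also picks up recv_size values
// that were parsed after the file had been opened.
inline void extend_to_recv_size(std::vector<space_sketch> &spaces)
{
  for (space_sketch &s : spaces)
    if (s.is_open && s.recv_size > s.size)
      s.size = s.recv_size;
}
```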
page_apply_insert_redundant(): Replace a too strict condition
hdr_c > pextra_size. It turns out that page_cur_insert_rec_low()
is not even computing the extra_size of cur->rec when it is trying
to reuse header bytes of the preceding record.
MDEV-25105 (commit 7a4fbb55b0)
in MariaDB 10.6 will refuse the innodb_checksum_algorithm
values none, innodb, strict_none, strict_innodb.
We will issue a deprecation warning if innodb_checksum_algorithm
is set to any of these non-default unsafe values.
innodb_checksum_algorithm=crc32 was made the default in
MySQL 5.7 and MariaDB Server 10.2, and given that older versions
of the server have reached their end of life, there is no valid
reason to use anything else than innodb_checksum_algorithm=crc32
or innodb_checksum_algorithm=strict_crc32 in MariaDB 10.3.
Reviewed by: Sergei Golubchik
- Currently page cleaner thread will stop flushing if
dirty_pct < innodb_max_dirty_pages_pct_lwm.
- If the server is not performing any activity, the otherwise unused
resources and time could be used to flush the pending dirty pages and keep
the buffer pool clean for the next burst of activity. This flushing is
called idle flushing.
- The flushing logic underwent a complete revamp in 10.5.7/8,
and as part of the revamp the idle flushing logic was removed.
- The newly proposed idle flushing logic is based on the updated logic of
the page cleaner, which will enable idle flushing if
- buf page cleaner is idle
- there are dirty pages (< innodb_max_dirty_pages_pct_lwm)
- server is not performing any activity
Logic will kickstart the idle flushing bounded by innodb_io_capacity.
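
The conditions above can be sketched as a single predicate. All names
here (flush_state, idle_flush_pages) are hypothetical, and bounding one
round of flushing by innodb_io_capacity pages is a simplifying
assumption, not a description of the actual page cleaner.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical snapshot of the inputs to the decision.
struct flush_state
{
  bool page_cleaner_idle;   // the page cleaner has no pending work
  bool server_idle;         // no server activity was observed
  std::size_t dirty_pages;  // number of dirty pages in the buffer pool
  double dirty_pct;         // percentage of dirty pages
  double max_dirty_pct_lwm; // innodb_max_dirty_pages_pct_lwm
  std::size_t io_capacity;  // innodb_io_capacity
};

// Number of pages to flush in one idle-flushing round,
// or 0 if idle flushing should not run.
inline std::size_t idle_flush_pages(const flush_state &s)
{
  const bool below_lwm = s.dirty_pct < s.max_dirty_pct_lwm;
  if (!s.page_cleaner_idle || !s.server_idle || !s.dirty_pages || !below_lwm)
    return 0;
  return std::min(s.dirty_pages, s.io_capacity);
}
```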
(Thanks to Marko Makela for reviewing the patch and the idea
right from its inception.)
The problem was that the CONNECT engine is trying to open the .frm file
during drop_table(), which the code did not take into account.
Fixed by adding the HA_REUSES_FILE_NAMES table flag to CONNECT.
Other things:
- Fixed a wrong test of HA_REUSES_FILE_NAMES in mysql_alter_table()
(The comment was correct, not the code)
- Added a test in the CONNECT engine so that if the .frm file it tries to
use in delete was not made for CONNECT, it will generate an error instead
of crashing.
InnoDB sets the space in dict_table_t to NULL when a table
is discarded. So InnoDB shouldn't use the space present
in the table to detect whether the given tablespace is
a temporary tablespace.
failed in dtuple_convert_big_rec
In dtuple_convert_big_rec(), InnoDB fails to consider the
instant metadata blob while choosing the variable length
field.
row_prebuilt_t::m_no_prefetch: Remove (it was always false).
row_prebuilt_t::m_read_virtual_key: Remove (it was always false).
Only ha_innopart ever set these fields.
innobase_rename_table(): Invoke dict_stats_wait_bg_to_stop_using_table()
to ensure that dict_stats_update() cannot be accessing the table name
that we will be modifying. If we are executing RENAME rather than TRUNCATE,
reset the flag at the end so that persistent statistics can be calculated
again.
The race condition was encountered with ASAN and rr.
Sorry, there is no test case, just as there is none for anything related to
dict_stats_wait_bg_to_stop_using_table(). The entire code is an ugly
work-around for the failure of dict_stats_process_entry_from_recalc_pool()
to acquire MDL.
Note: It appears that an ALTER TABLE that is not rebuilding the table
will fail to reset the flag that blocks the processing of statistics.
In btr_index_rec_validate(), the check for externally stored columns
is missing when matching the length of the field
with the length of the field data stored in the record.
Fetch the length of the externally stored part and compare it
with the fixed field length.
- This issue is caused by commit deadec4e68
(MDEV-24569). InnoDB fails to read the change buffer bitmap page
from dropped tablespace. In ibuf_bitmap_get_map_page_func(), InnoDB
should fetch the page using BUF_GET_POSSIBLY_FREED mode. Callers of
ibuf_bitmap_get_map_page() should be adjusted in that case.
This is after-merge fix of f33e57a9e6.
In btr_search_drop_page_hash_index(), InnoDB should take
the exclusive lock on the AHI latch if the index is already
freed, to avoid accessing freed memory during buf_pool_resize().
This is a backport of commit 18535a4028
from 10.6.
lock_release(): Implement innodb_evict_tables_on_commit_debug.
Before releasing any locks, collect the identifiers of tables to
be evicted. After releasing all locks, look up the tables and
evict them if it is safe to do so.
trx_commit_in_memory(): Invoke trx_update_mod_tables_timestamp()
before lock_release(), so that our locks will protect the tables
from being evicted.
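
The collect-then-evict ordering can be sketched as follows. The types
and the function release_locks_then_evict() are hypothetical; the
safe_to_evict callback stands in for whatever check decides that a
table may be evicted.

```cpp
#include <cstdint>
#include <functional>
#include <set>
#include <vector>

// Hypothetical stand-in for a lock held by the transaction.
struct lock_sketch
{
  std::uint64_t table_id;
};

// Returns the IDs of the tables that were evicted.
inline std::vector<std::uint64_t>
release_locks_then_evict(std::vector<lock_sketch> &locks,
                         const std::function<bool(std::uint64_t)> &safe_to_evict)
{
  // 1. While the locks still protect the tables, remember their IDs.
  std::set<std::uint64_t> candidates;
  for (const lock_sketch &lock : locks)
    candidates.insert(lock.table_id);

  // 2. Release all locks.
  locks.clear();

  // 3. Look the tables up again by ID and evict only those that are safe.
  std::vector<std::uint64_t> evicted;
  for (std::uint64_t id : candidates)
    if (safe_to_evict(id))
      evicted.push_back(id);
  return evicted;
}
```

Collecting identifiers rather than pointers means that nothing is
dereferenced after the locks that protected the tables are gone.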
When doing a truncate on an InnoDB table under LOCK TABLES, InnoDB would
rename the old table to #sql-... and recreate a new 't1' table. The table
lock would still be on the #sql-table.
When doing ALTER TABLE, InnoDB would do the changes on the #sql table
(which would disappear on close).
When the SQL layer, as part of inline alter table, would close the
original t1 table (#sql in InnoDB) and then reopen the t1 table, InnoDB
would notice that this does not match its own (old) t1 table and
generate an error.
Fixed by adding code in truncate table that if we are under lock tables
and truncating an InnoDB table, we would close, reopen and lock the
table after truncate. This will remove the #sql table and ensure that
lock tables is using the new empty table.
Reviewer: Marko Mäkelä
The test case encryption.innodb_encrypt_freed was failing in
MemorySanitizer builds.
recv_recover_page(): Mark non-recovered pages as freed.
fil_crypt_rotate_page(): Before comparing the block->frame contents,
check if the block was marked as freed.
Other places: Whenever using BUF_GET_POSSIBLY_FREED, check the
block->page.status before accessing the page frame.
(Both uses of BUF_GET_IF_IN_POOL should be correct now.)
eprintf() was missing a va_start(), which caused a wrong filename to be
printed when printing the recovery trace.
Also added a missing newline when printing "Table is crashed" to the
trace file.
- This is caused by commit deadec4e68
(MDEV-24569). InnoDB fails to set the tablespace associated with
the mini-transaction while resetting the change buffer bitmap bits of
the page.
- The commit 5fd3c7471e3e0673b50d309567c9747d36f09412 (MDEV-24709)
resets recv_no_ibuf_operations in
recv_recovery_from_checkpoint_start(), but InnoDB fails to reset
the variable recv_no_log_write at that time, which leads
to the assertion failure.