The problem was that crypt_data->min_key_version is not a reliable way
to detect whether a tablespace is encrypted. It could cause the
key_version in the first page of the second system tablespace file
(page 192, and similarly for other files if more are configured) to be
replaced with zero, leading to corruption: on the next startup, the
page would be considered corrupted.
Note that crypt_data->min_key_version is updated only after all
pages of the tablespace have been processed (i.e. key rotation is
complete) and flushed.
fil_write_flushed_lsn(): Use crypt_data->should_encrypt() instead.
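A minimal sketch of the difference, using stand-in types rather than
the actual fil_space_crypt_t definition:

    #include <cstdint>

    // Stand-in for fil_space_crypt_t; only the members relevant here.
    struct crypt_data_t {
        uint32_t min_key_version; // updated only after full key rotation
        bool     encrypting;      // current intent: encrypt new page writes

        // Analogous to the should_encrypt() used by the fix.
        bool should_encrypt() const { return encrypting; }
    };

    // Writing the first page of a tablespace file (e.g. the flushed LSN).
    void write_first_page(const crypt_data_t* crypt)
    {
        // Wrong: while key rotation is in progress, min_key_version can
        // still be 0 even though pages are being written encrypted, so
        // this page would carry key_version=0 and be read as corrupted.
        // bool encrypt = crypt && crypt->min_key_version != 0;

        // Right: consult the current encryption intent directly.
        bool encrypt = crypt && crypt->should_encrypt();
        (void)encrypt; // ... encrypt (or not) and write the page ...
    }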
The InnoDB background DROP TABLE queue is something that we should
really remove, but are unable to until we remove dict_operation_lock
so that DDL and DML operations can be combined in a single transaction.
Because the queue is not persistent, it is not crash-safe. In stable
versions of MariaDB, we can only try harder to drop all enqueued
tables before server shutdown.
row_mysql_drop_t::table_id: Replaces table_name.
row_drop_tables_for_mysql_in_background():
Do not remove the entry from the list as long as the table exists.
In this way, the table should eventually be dropped.
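Roughly, the retry loop behaves like this sketch (illustrative names;
the real entry type is row_mysql_drop_t):

    #include <cstdint>
    #include <list>

    struct drop_entry_t { uint64_t table_id; }; // cf. row_mysql_drop_t::table_id

    // Illustrative stub: attempt the drop; fails while the table is busy.
    static bool try_drop_table(uint64_t) { return false; }

    // One pass of the background drop loop: an entry stays enqueued for
    // as long as the table still exists, so it is eventually dropped.
    static void drop_tables_in_background(std::list<drop_entry_t>& queue)
    {
        for (auto it = queue.begin(); it != queue.end(); ) {
            if (try_drop_table(it->table_id)) {
                it = queue.erase(it); // dropped: remove from the queue
            } else {
                ++it;                 // still in use: retry on a later pass
            }
        }
    }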
trx_roll_must_shutdown(): During the rollback of recovered transactions,
report progress and check if the rollback should be interrupted because
of a pending shutdown.
trx_roll_max_undo_no, trx_roll_progress_printed_pct: Remove, along with
the messages that were interleaved with other messages.
row_undo_step(), trx_rollback_active(): Abort the rollback of a
recovered ordinary transaction if fast shutdown has been initiated.
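The control flow is roughly the following; the shutdown flag and
function names below are illustrative, not the actual InnoDB symbols:

    #include <atomic>

    static std::atomic<bool> fast_shutdown_requested{false};

    struct trx_t { bool is_recovered; };

    // Illustrative stub: apply one undo record; returns false when done.
    static bool undo_one_record(trx_t&) { return false; }

    // Roll back one recovered transaction, checking between undo records
    // whether fast shutdown was initiated (cf. trx_roll_must_shutdown()).
    // Returns false if the rollback was aborted because of shutdown.
    static bool rollback_recovered(trx_t& trx)
    {
        while (undo_one_record(trx)) {
            if (trx.is_recovered && fast_shutdown_requested.load()) {
                return false; // leave the trx to be faked as XA PREPARE
            }
        }
        return true;
    }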
trx_rollback_resurrected(): Convert an aborted-rollback transaction
into a fake XA PREPARE transaction, so that fast shutdown can proceed.
trx_rollback_resurrected(): If shutdown was initiated, fake all
remaining active transactions to XA PREPARE state, so that shutdown
can proceed. Also, make the parameter "all" an output parameter that
will be set to FALSE in this case.
trx_rollback_or_clean_recovered(): Remove the shutdown check
(it was moved to trx_rollback_resurrected()).
trx_undo_free_prepared(): Relax assertions.
This is the 10.1 version, where no merge error exists.
wsrep_on_check
New check function. Galera can't be enabled
if innodb-lock-schedule-algorithm=VATS.
innobase_kill_query
In a Galera asynchronous kill, we could already own the lock mutex.
innobase_init
If the Variance-Aware Transaction Scheduling (VATS) algorithm is
used with Galera, we refuse to start InnoDB.
Changed innodb-lock-schedule-algorithm to a read-only parameter,
as it was designed to be.
lock_rec_other_has_expl_req,
lock_rec_other_has_conflicting,
lock_rec_lock_slow,
lock_table_other_has_incompatible,
lock_rec_insert_check_and_lock
Change the pointer to the conflicting lock into a normal pointer, as
the contents of this pointer could be changed later.
* created tests focusing on multi-master conflicts during cascading foreign key
processing
* in row0upd.cc, calling wsrep_row_upd_check_foreign_constraints only when
running in a cluster
* in row0ins.cc, fixed a regression from MW-369, which caused a crash with MW-402.test
With a big buffer pool that contains many data pages,
DISCARD TABLESPACE took a long time, because it would scan the
entire buffer pool to remove any pages that belong to the tablespace.
This was especially wasteful when the table being discarded was empty.
The minimum amount of work that DISCARD TABLESPACE must do is to
remove the pages of the to-be-discarded table from the
buf_pool->flush_list because any writes to the data file must be
prevented before the file is deleted.
If DISCARD TABLESPACE does not evict the pages from the buffer pool,
then IMPORT TABLESPACE must do it, because we must prevent pre-DISCARD,
not-yet-evicted pages from being mistaken for pages of the imported
tablespace.
It would not be a useful fix to simply move the buffer pool scan to
the IMPORT TABLESPACE step. What we can do is to actively evict those
pages that could be mistaken for imported pages. In this way, when
importing a small table into a big buffer pool, the import should
still run relatively fast.
Import bypasses the buffer pool when reading pages for the
adjustment phase. In the adjustment phase, if a page exists in
the buffer pool, we could replace it with the page from the imported
file. Unfortunately I did not get this to work properly, so instead
we will simply evict any matching page from the buffer pool.
buf_page_get_gen(): Implement BUF_EVICT_IF_IN_POOL, a new mode
where the requested page will be evicted if it is found. There
must be no unwritten changes for the page.
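In spirit, the new mode behaves like this sketch over a simplified
page cache (all types are stand-ins):

    #include <cassert>
    #include <cstdint>
    #include <unordered_map>

    struct page_t { bool dirty = false; };  // dirty = unwritten changes
    using page_id_t = uint64_t;

    struct buffer_pool_t {
        std::unordered_map<page_id_t, page_t> pages;

        // BUF_EVICT_IF_IN_POOL semantics: if the page is cached, evict
        // it; the caller guarantees it has no unwritten changes.
        void evict_if_in_pool(page_id_t id)
        {
            auto it = pages.find(id);
            if (it == pages.end()) return; // not cached: nothing to do
            assert(!it->second.dirty);     // no unwritten changes allowed
            pages.erase(it);               // drop the clean page
        }
    };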
buf_remove_t: Remove. Instead, use trx!=NULL to signify that a write
to file is desired, and use a separate parameter bool drop_ahi.
buf_LRU_flush_or_remove_pages(), fil_delete_tablespace():
Replace buf_remove_t.
buf_LRU_remove_pages(), buf_LRU_remove_all_pages(): Remove.
PageConverter::m_mtr: A dummy mini-transaction buffer.
PageConverter::PageConverter(): Complete the member initialization list.
PageConverter::operator()(): Evict any 'shadow' pages from the
buffer pool so that pre-existing (garbage) pages cannot be mistaken
for pages that exist in the being-imported file.
row_discard_tablespace(): Remove a bogus comment that seems to
refer to IMPORT TABLESPACE, not DISCARD TABLESPACE.
ibuf_check_bitmap_on_import(): Only access the pages that
are below FSP_FREE_LIMIT. It is possible that especially with
ROW_FORMAT=COMPRESSED, the FSP_SIZE will be much bigger than
the FSP_FREE_LIMIT, and the bitmap pages (page_size*N, 1+page_size*N)
are filled with zero bytes.
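The bounded scan amounts to the following sketch (parameter names are
illustrative):

    #include <cstdint>

    // Illustrative stub: validate one change buffer bitmap page.
    static void check_bitmap_page(uint32_t) {}

    // Visit only the bitmap pages (1 + page_size*N) below the free
    // limit; pages at or beyond FSP_FREE_LIMIT may be all-zero,
    // especially with ROW_FORMAT=COMPRESSED.
    static void check_bitmaps(uint32_t free_limit, uint32_t page_size)
    {
        for (uint32_t n = 0; ; ++n) {
            uint32_t bitmap_page = 1 + n * page_size;
            if (bitmap_page >= free_limit) break; // never read past the limit
            check_bitmap_page(bitmap_page);
        }
    }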
buf_page_is_corrupted(), buf_page_io_complete(): Make the
fault injection compatible with MariaDB 10.2.
Backport the IMPORT tests from 10.2.
On some old GNU/Linux systems, invoking posix_fallocate() with
offset=0 would sometimes cause already allocated bytes in the
data file to be overwritten.
Fix a correctness regression that was introduced in
commit 420798a81a
by invoking posix_fallocate() in a safer way.
A similar change was made in MDEV-5746 earlier.
os_file_get_size(): Avoid changing the state of the file handle,
by invoking fstat() instead of lseek().
os_file_set_size(): Determine the current size of the file
by os_file_get_size(), and then extend the file from that point
onwards.
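In POSIX terms the pattern is roughly this (a sketch with minimal
error handling, not the actual os0file.cc code):

    #include <fcntl.h>
    #include <sys/stat.h>
    #include <cerrno>

    // Extend fd to new_size without touching already-allocated bytes.
    // Returns 0 on success, otherwise an errno-style value.
    static int extend_file(int fd, off_t new_size)
    {
        struct stat st;
        if (fstat(fd, &st) != 0) {
            return errno;  // size query via fstat(); no lseek() needed
        }
        if (st.st_size >= new_size) {
            return 0;      // already large enough
        }
        // Allocate only from the current end of file onwards, so the
        // problematic offset=0 call is never issued for a non-empty file.
        return posix_fallocate(fd, st.st_size, new_size - st.st_size);
    }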
os_file_set_size(): If posix_fallocate() returns EINVAL, fall back
to writing zero bytes to the file. Also, remove some error log output,
and make it possible for a server shutdown to interrupt the fall-back
code.
MariaDB used to ignore any possible return value from posix_fallocate()
ever since innodb_use_fallocate was introduced in MDEV-4338. If EINVAL
was returned, the file would not be extended.
Starting with MDEV-11520, MariaDB would treat EINVAL as a hard error.
Why is EINVAL returned? The GNU posix_fallocate() function
would first try the fallocate() system call, which would return
-EOPNOTSUPP for many file systems (notably, not ext4). Then, it
would fall back to extending the file one block at a time by invoking
pwrite(fd, "", 1, offset) where offset is 1 less than a multiple of
the file block size. This would fail with EINVAL if the file is in
O_DIRECT mode, because O_DIRECT requires aligned operation.
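A fallback under those constraints has to append zero-filled, aligned
blocks instead; roughly (block size, alignment and the shutdown hook
are illustrative, and offsets are assumed to be block-aligned):

    #include <unistd.h>
    #include <cerrno>
    #include <cstdlib>
    #include <cstring>

    // Illustrative stub for the shutdown check mentioned above.
    static bool shutdown_requested() { return false; }

    // Fallback when posix_fallocate() fails with EINVAL under O_DIRECT:
    // extend the file by writing aligned zero-filled blocks.
    static int extend_by_zero_writes(int fd, off_t cur_size, off_t new_size)
    {
        enum { BLOCK = 4096 };  // assumed O_DIRECT-compatible alignment
        void* buf = nullptr;
        if (posix_memalign(&buf, BLOCK, BLOCK)) return ENOMEM;
        memset(buf, 0, BLOCK);

        int err = 0;
        for (off_t ofs = cur_size; ofs < new_size && !err; ofs += BLOCK) {
            if (shutdown_requested()) { err = EINTR; break; } // interruptible
            if (pwrite(fd, buf, BLOCK, ofs) != BLOCK) err = errno;
        }
        free(buf);
        return err;
    }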
When MariaDB 10.1.0 introduced table options for encryption and
compression, it unnecessarily changed
ha_innobase::check_if_supported_inplace_alter() so that ALGORITHM=COPY
is forced when these parameters differ.
A better solution is to move the check to innobase_need_rebuild().
In that way, the ALGORITHM=INPLACE interface (yes, the syntax is
very misleading) can be used for rebuilding the table much more
efficiently, with merge sort, with no undo logging, and allowing
concurrent DML operations.
If InnoDB or XtraDB recovered committed transactions at server
startup, but the processing of recovered transactions was
prevented by innodb_read_only or by innodb_force_recovery,
an assertion would fail at shutdown.
This bug was originally reproduced when Mariabackup executed
InnoDB shutdown after preparing (applying redo log into) a backup.
trx_free_prepared(): Allow TRX_STATE_COMMITTED_IN_MEMORY.
trx_undo_free_prepared(): Allow any undo log state. For transactions
that were resurrected in TRX_STATE_COMMITTED_IN_MEMORY,
the undo log state would have been reset by trx_undo_set_state_at_finish().
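The relaxation amounts to accepting one more state at shutdown,
sketched here with a stand-in enum:

    #include <cassert>

    enum trx_state_t {
        TRX_STATE_ACTIVE,
        TRX_STATE_PREPARED,
        TRX_STATE_COMMITTED_IN_MEMORY
    };

    // Sketch of the relaxed trx_free_prepared() check: recovered
    // transactions whose processing was blocked by innodb_read_only or
    // innodb_force_recovery may still be COMMITTED_IN_MEMORY here.
    static void free_at_shutdown(trx_state_t state)
    {
        assert(state == TRX_STATE_PREPARED
               || state == TRX_STATE_COMMITTED_IN_MEMORY); // newly allowed
        // ... release the transaction object ...
    }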
Replace all references in InnoDB and XtraDB error log messages
to bugs.mysql.com with references to https://jira.mariadb.org/.
The original merge
commit 4274d0bf57
was accidentally reverted by the subsequent merge
commit 3b35d745c3.
InnoDB was writing unnecessary information to the
update undo log records. Most notably, if an indexed column is updated,
the old value of the column would be logged twice: first as part of
the update vector, and then another time because it is an indexed column.
Because the InnoDB undo log record must fit in a single page,
this would cause unnecessary failure of certain updates.
Even after this fix, InnoDB still seems to be unnecessarily logging
indexed column values for non-updated columns. It seems that non-updated
secondary index columns only need to be logged when a PRIMARY KEY
column is updated. To reduce risk, we are not fixing this remaining flaw
in GA versions.
trx_undo_page_report_modify(): Log updated indexed columns only once.
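The deduplication idea, over simplified column records (not the
actual undo record format):

    #include <cstdint>
    #include <set>
    #include <vector>

    struct col_val_t { uint32_t col_no; /* old value bytes ... */ };

    // Write each column's old value into the undo record at most once,
    // whether it arrives via the update vector or as an indexed column.
    static std::vector<col_val_t> undo_columns(
        const std::vector<col_val_t>& update_vector,
        const std::vector<col_val_t>& indexed_columns)
    {
        std::vector<col_val_t> rec;
        std::set<uint32_t> logged;
        for (const col_val_t& c : update_vector)
            if (logged.insert(c.col_no).second) rec.push_back(c);
        for (const col_val_t& c : indexed_columns)
            if (logged.insert(c.col_no).second) rec.push_back(c); // skip dup
        return rec; // the record must still fit in a single undo page
    }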
i_s_sys_tables_fill_table_stats(): Acquire dict_operation_lock
S-latch before acquiring dict_sys->mutex, to prevent the table
from being removed from the data dictionary cache and from
being freed while i_s_dict_fill_sys_tablestats() is accessing
the table handle.
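The acquisition order can be pictured with standard C++ primitives
standing in for the InnoDB latches:

    #include <mutex>
    #include <shared_mutex>

    static std::shared_mutex dict_operation_lock; // stand-in rw-lock
    static std::mutex dict_sys_mutex;             // stand-in dict_sys->mutex

    static void fill_one_row(/* const table handle */) {}

    static void fill_sys_tablestats()
    {
        // 1) S-latch first: blocks DDL from evicting or freeing the table.
        std::shared_lock<std::shared_mutex> s(dict_operation_lock);
        // 2) Then the dictionary mutex for the cache lookup.
        std::lock_guard<std::mutex> m(dict_sys_mutex);
        fill_one_row(); // safe: the table cannot go away under the S-latch
    }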
The ownership of the field query->intersection usually transfers
to query->doc_ids. In some error scenarios, it is possible
that fts_query_free() is invoked with query->intersection != NULL.
Let us handle that case, instead of intentionally crashing the server.
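Handled, the cleanup looks roughly like this (stand-in types; the real
code frees fts-specific structures):

    struct doc_ids_t { /* ... */ };

    struct fts_query_t {
        doc_ids_t* doc_ids      = nullptr;
        doc_ids_t* intersection = nullptr; // normally handed to doc_ids
    };

    // Tolerate a leftover intersection on error paths instead of
    // crashing on an assertion.
    static void query_free(fts_query_t& q)
    {
        if (q.intersection) {  // ownership handover never happened
            delete q.intersection;
            q.intersection = nullptr;
        }
        delete q.doc_ids;
        q.doc_ids = nullptr;
    }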
When MySQL 5.6.10 introduced innodb_read_only mode, it skipped the
creation of the InnoDB buffer pool dump/restore subsystem in that mode.
Attempts to set the variable innodb_buffer_pool_dump_now would have
no effect in innodb_read_only mode, but the corresponding condition
was omitted from the other two update functions.
MySQL 5.7.20 fixed innodb_buffer_pool_load_now,
but not innodb_buffer_pool_load_abort. Let us fix both in MariaDB.
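All three update functions need the same guard; schematically (names
are illustrative):

    static bool srv_read_only_mode = false; // stand-in for the real flag

    static void buf_dump_now()   { /* wake the buffer pool dump thread */ }
    static void buf_load_now()   { /* wake the buffer pool load thread */ }
    static void buf_load_abort() { /* request that a load be aborted */ }

    // Guard shared by innodb_buffer_pool_dump_now, ..._load_now and
    // ..._load_abort: in read-only mode the dump/restore subsystem was
    // never created, so setting the variable must be a no-op.
    static void update_if_allowed(void (*action)())
    {
        if (srv_read_only_mode) return;
        action();
    }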