The original test in the report of MDEV-31463 is contrived and
nondeterministic, causing MDEV-31586. We update the test to make it
more directly addresses the underlying cause of MDEV-31463, namely
errors from queries sent to the data node not consumed when trying to
set lock wait timeout. This is achieved through the debug sync
facility.
Remove the exception that InnoDB does not report auto-increment locks waits
to the parallel replication.
There was an assumption that these waits could not cause conflicts with
in-order parallel replication and thus need not be reported. However, this
assumption is wrong and it is possible to get conflicts that lead to hangs
for the duration of --innodb-lock-wait-timeout. This can be seen with three
transactions:
1. T1 is waiting for T3 on an autoinc lock
2. T2 is waiting for T1 to commit
3. T3 is waiting on a normal row lock held by T2
Here, T3 needs to be deadlock killed on the wait by T1.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
Restore code to make InnoDB choose the second transaction as a deadlock
victim if two transactions deadlock that need to commit in-order for
parallel replication. This code was erroneously removed when VATS was
implemented in InnoDB.
Also add a test case for InnoDB choosing the right deadlock victim.
Also fixes this bug, with testcase that reliably reproduces:
MDEV-28776: rpl.rpl_mark_optimize_tbl_ddl fails with timeout on sync_with_master
Reviewed-by: Marko Mäkelä <marko.makela@mariadb.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
- InnoDB aborts when table is dropping the column. This is
caused by 5f09b53bdb (MDEV-31086).
While iterating the altered table fields, we fail to consider
the dropped columns.
when validating vcol's (default, check, etc) in ALTER TABLE
vcol_info->flags are modified in place. This means that if ALTER TABLE
fails for any reason we need to restore them to their original values.
(mroonga was freeing the memory on ::reset() but not on ::close())
recv_scan_log(): On recv_sys_t::PREMATURE_EOF, keep reading more log
if recv_sys.lsn < recv_sys.scanned_lsn.
recv_recovery_from_checkpoint_start(): Add a safety check to abort
crash recovery if recv_sys.lsn is not recv_sys.scanned_lsn.
This fixes a serious database corruption bug that was introduced by
commit 2f9e264781 (MDEV-29911).
recv_ring::copy_if_needed(): If the record wraps around the
memory-mapped ib_logfile0, do copy it also if len==0
(the record consists only of a header, like FREE_PAGE and INIT_PAGE
records do).
recv_sys_t::parse(): Invoke recv_ring::copy_if_needed() for INIT_PAGE
and FREE_PAGE records, so that if these records wrap around the
memory-mapped ib_logfile0, they will be correctly copied to
recv_sys.pages.
Together with commit 0d175968d1 (MDEV-31354)
this fixes occasional failures of the test innodb.recovery_memory.
mark old keys in the ALTER TABLE with the `old` flag, not with
the `key_create_info.check_for_duplicate_indexes`.
This allows to mark old foreign keys too.
HA_UNIQUE_CHECK was
* only used internally by MyISAM/Aria
* only used for internal temporary tables (for DISTINCT)
* never saved in frm
* saved in MYI/MAD but only for temporary tables
* only set, never checked
it's safe to remove it and free the bit (there are only 16 of them)
buf_page_t::write_complete(), buf_page_write_complete(),
IORequest::write_complete(): Add a parameter for passing
an error code. If an error occurred, we will release the
io-fix, buffer-fix and page latch but not reset the
oldest_modification field. The block would remain in
buf_pool.LRU and possibly buf_pool.flush_list, to be written
again later, by buf_flush_page_cleaner(). If all page writes
start consistently failing, all write threads should eventually
hang in log_free_check() because the log checkpoint cannot
be advanced to make room in the circular write-ahead-log ib_logfile0.
IORequest::read_complete(): Add a parameter for passing
an error code. If a read operation fails, we report the error
and discard the page, just like we would do if the page checksum
was not validated or the page could not be decrypted.
This only affects asynchronous reads, due to linear or random read-ahead
or crash recovery. When buf_page_get_low() invokes buf_read_page(),
that will be a synchronous read, not involving this code.
This was tested by randomly injecting errors in
write_io_callback() and read_io_callback(), like this:
if (!ut_rnd_interval(100))
cb->m_err= 42;
buf_LRU_free_page(): When we are discarding the uncompressed copy of a
ROW_FORMAT=COMPRESSED page, buf_page_t::can_relocate() must have ensured
that the block descriptor state is one of FREED, UNFIXED, REINIT.
Do not overwrite the state with UNFIXED. We do not want to write back
pages that were actually freed, and we want to avoid doublewrite for
pages that were (re)initialized by log records written since the latest
checkpoint. Last but not least, we do not want crashes like those that
commit dc1bd1802a (MDEV-31386)
was supposed to fix.
The test innodb_zip.wl5522_zip should typically cover all 3 states.
This bug is a regression due to
commit aaef2e1d8c (MDEV-27058).
Problem:
========
- InnoDB scans the complete redo log to ensure that there is
no corruption and to find the end of the log. During this scan,
InnoDB saves all the freed ranges, but it doesn't save
recovered size. Later, InnoDB recovery applies partial
redo logs and IO thread tries to flush the all freed
ranges which was noted during previous complete scan of redo logs.
Fix:
====
InnoDB should store the freed pages only when InnoDB stores
the redo log records.
log_sort_flush_list(): Wait for any pending page writes to cease
before sorting the buf_pool.flush_list. Starting with
commit 22b62edaed (MDEV-25113),
it is possible that some buf_page_t::oldest_modification_
that we will be comparing in std::sort() will be updated from
some value >2 to 1 while we are holding buf_pool.flush_list_mutex.
To catch this type of trouble better in the future, we will clean
garbage (pages that have been written out) from buf_pool.flush_list
while constructing the array for sorting, and check with debug
assertions that all blocks that we are copying from the array to the
list will be dirty (requiring a writeback) while we sort and copy
the array back to buf_pool.flush_list.
This failure was observed by chance exactly once when running the test
innodb.recovery_memory. It was never reproduced in the same form
afterwards. Unrelated to this change, the test does occasionally
reproduce a failure to start up InnoDB due to a corrupted page
being read in recovery. The ticket MDEV-31791 was filed for that.
Tested by: Matthias Leich
- InnoDB fails to update the autoinc persistently after
bulk insert operation.
row_merge_bulk_t::write_to_index(): Update the autoinc value
persistently
This patch adds for "--ps-protocol" second execution
of queries "SELECT".
Also in this patch it is added ability to disable/enable
(--disable_ps2_protocol/--enable_ps2_protocol) second
execution for "--ps-prototocol" in testcases.