log_free_check(): Assert that the caller is not holding an exclusive
lock_sys.latch. Calls from ibuf_delete_for_discarded_space() violated
this, which caused a deadlock with another thread that was holding a
latch on a dirty page. That page needed to be written so that the
checkpoint could advance and log_free_check() could return, but the
other thread was waiting for a shared lock_sys.latch.
fil_delete_tablespace(): Do not invoke ibuf_delete_for_discarded_space()
because in DDL operations, we will be holding exclusive lock_sys.latch.
trx_t::commit(std::vector<pfs_os_file_t>&), innodb_drop_database(),
row_purge_remove_clust_if_poss_low(), row_undo_ins_remove_clust_rec(),
row_discard_tablespace_for_mysql():
Invoke ibuf_delete_for_discarded_space() on the deleted tablespaces after
releasing all latches.
page_cleaner_flush_pages_recommendation(): If dirty_pct is
between innodb_max_dirty_pages_pct_lwm
and innodb_max_dirty_pages_pct,
scale the effort relative to how close we are to
innodb_max_dirty_pages_pct.
The previous formula was missing a multiplication by 100.
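As an illustration only, the scaling between the two thresholds could look
like the sketch below. The function name and signature are hypothetical, and
the exact expression in page_cleaner_flush_pages_recommendation() may differ.
The point is that the ratio must be multiplied by 100 to yield a percentage.

```cpp
// Hypothetical helper, not the InnoDB function. All arguments are
// percentages; lwm_pct and max_pct stand for innodb_max_dirty_pages_pct_lwm
// and innodb_max_dirty_pages_pct. Returns a flushing effort in percent.
double flush_effort_pct(double dirty_pct, double lwm_pct, double max_pct)
{
  if (dirty_pct < lwm_pct)
    return 0.0;   // below the low-water mark: no extra effort
  if (dirty_pct >= max_pct)
    return 100.0; // at or above the limit: full effort
  // Between the thresholds: scale by closeness to max_pct.
  // Without the factor 100 this would be a fraction, not a percentage.
  return dirty_pct * 100.0 / max_pct;
}
```

With lwm_pct=10 and max_pct=90, a dirty_pct of 45 maps to an effort of 50
percent in this sketch; without the multiplication it would be 0.5.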
Tested by: Axel Schwenke
This addition to MDEV-30804 is relevant for 10.6+. It excludes
the mixed transaction section, which uses both the InnoDB and Aria
storage engines, from the galera_var_replicate_aria_off test, since
such transactions cannot be executed unless Aria supports two-phase
transaction commit. No additional tests are required as this
commit fixes the mtr test itself.
buf_pool_t::page_cleaner_wakeup(): If for_LRU=true, wake up the page
cleaner immediately, also when it is in a timed wait. This avoids an
unnecessary delay of up to 1 second.
When commit a5a2ef079c
implemented asynchronous doublewrite, the writes via
the doublewrite buffer started to be counted incorrectly,
without multiplying them by innodb_page_size.
srv_export_innodb_status(): Correctly count the
Innodb_data_written.
buf_dblwr_t: Remove submitted(), because it is close to written()
and only Innodb_data_written was interested in it. According to
its name, it should count completed and not submitted writes.
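The accounting problem can be pictured with a small sketch. The function and
parameter names are hypothetical and do not mirror srv_export_innodb_status().
It only shows that a count of pages must be converted to bytes before it is
added to a byte counter.

```cpp
#include <cstdint>

// Hypothetical helper: other_bytes is what was written outside the
// doublewrite buffer, dblwr_pages is the number of pages written via the
// doublewrite buffer, page_size stands for innodb_page_size in bytes.
std::uint64_t data_written_bytes(std::uint64_t other_bytes,
                                 std::uint64_t dblwr_pages,
                                 std::uint64_t page_size)
{
  // Adding dblwr_pages directly would mix pages with bytes.
  return other_bytes + dblwr_pages * page_size;
}
```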
Tested by: Axel Schwenke
If a replica failed to update the GTID slave state when committing
an XA PREPARE, the replica would retry the transaction and get an
out-of-order GTID error. This is because the commit phase of an XA
PREPARE is bifurcated. That is, first, the prepare is handled by the
relevant storage engines. Then second, the GTID slave state is
updated as a separate autocommit transaction. If the second phase
fails, and the transaction is retried, then the same transaction is
attempted to be committed again, resulting in a GTID out-of-order
error.
This patch fixes this error by immediately stopping the slave and
reporting the appropriate error. That is, there was logic to bypass
the error when updating the GTID slave state table if the underlying
error is allowed for retry on a parallel slave. This patch adds a
parameter to disallow the error bypass, thereby forcing the error
state to still happen.
Reviewed By
============
Andrei Elkin <andrei.elkin@mariadb.com>
os_aio_wait_until_no_pending_reads(), os_aio_wait_until_pending_writes():
Add a Boolean parameter to indicate whether the wait should be declared
in the thread pool.
buf_flush_wait(): The callers have already declared a wait, so let us
avoid doing that again, just call os_aio_wait_until_pending_writes(false).
buf_flush_wait_flushed(): Do not declare a wait in the rare case that
the buf_flush_page_cleaner thread has been shut down already.
buf_flush_page_cleaner(), buf_flush_buffer_pool(): In the code that runs
during shutdown, do not declare waits.
buf_flush_buffer_pool(): Remove a debug assertion that might fail.
What really matters here is buf_pool.flush_list.count==0.
buf_read_recv_pages(), srv_prepare_to_delete_redo_log_file():
Do not declare waits during InnoDB startup.
Fixing buildbot failures on mariabackup.aria_log_dir_path_rel.
The problem was that directory_exists() was called with the
relative aria_log_dir_path value, while the current directory
in mariadb-backup is not necessarily equal to datadir when MTR is running.
Fix:
- Moving the construction of the absolute path one level up:
from the function copy_back_aria_logs() to the function copy_back().
- Passing the built absolute path to both directory_exists() and
copy_back_aria_logs() as a parameter.
- This patch does the following:
git revert --no-commit 673243c893
git revert --no-commit 6c669b9586
git revert --no-commit bacaf2d4f4
git checkout HEAD mysql-test
git revert --no-commit 1fd7d3a9ad
The above commands revert MDEV-29277, MDEV-25581, MDEV-29342.
When the binlog is enabled, a transaction takes a long time to perform
the sync operation on an InnoDB fulltext (FTS) table. This blocks the
commit of other transactions. To avoid this, remove the fulltext sync
operation during transaction commit by reverting the
MDEV-25581 related patches.
We filed MDEV-31105 to avoid the memory consumption
problem during fulltext sync operation.
This bug caused server crash when processing a multi-update statement that
used views if optimizer tracing was enabled.
The bug was introduced in the patch for MDEV-30539 that could incorrectly
detect the top-level selects of queries if views were used in them.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
- `mariadb-backup --backup` was fixed to fetch the value of the
@@aria_log_dir_path server variable and copy aria_log* files
from @@aria_log_dir_path directory to the backup directory.
Absolute and relative (to --datadir) paths are supported.
Before this change aria_log* files were copied to the backup
only if they were in the default location in @@datadir.
- `mariadb-backup --copy-back` now understands a new my.cnf and command line
parameter --aria-log-dir-path.
`mariadb-backup --copy-back` in the main loop in copy_back()
(when copying back from the backup directory to --datadir)
was fixed to ignore all aria_log* files.
A new function copy_back_aria_logs() was added.
It consists of a separate loop copying back aria_log* files from
the backup directory to the directory specified in --aria-log-dir-path.
Absolute and relative (to --datadir) paths are supported.
If --aria-log-dir-path is not specified,
aria_log* files are copied to --datadir by default.
- The function is_absolute_path() was fixed to understand MTR style
paths on Windows with forward slashes, e.g.
--aria-log-dir-path=D:/Buildbot/amd64-windows/build/mysql-test/var/...
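A check that accepts both separator styles could look roughly like this. It
is a simplified sketch with a hypothetical name and a runtime `windows` flag
for illustration. The real is_absolute_path() may be structured differently,
and UNC paths are not considered here.

```cpp
#include <cctype>
#include <string>

// Hypothetical sketch: on Windows, accept "X:\..." as well as the
// MTR-style "X:/..."; elsewhere, accept paths starting with '/'.
bool is_absolute_path_sketch(const std::string &path, bool windows)
{
  if (!windows)
    return !path.empty() && path[0] == '/';
  return path.size() >= 3 &&
         std::isalpha(static_cast<unsigned char>(path[0])) &&
         path[1] == ':' &&
         (path[2] == '\\' || path[2] == '/');
}
```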
The motivation of this change is to allow undo pages for temporary tables
to be marked free as often as possible, so that we can avoid buf_pool.LRU
eviction (and writes) of undo pages that contain data that is
no longer needed. For temporary tables, no MVCC or purge of history
is needed, and reusing cached undo log pages might not help that much.
It is possible that this may cause some performance regression due to
more frequent allocation and freeing of undo log pages, but I only
measured a performance improvement.
trx_write_serialisation_history(): Never cache temporary undo log pages.
trx_undo_reuse_cached(): Assert that the rollback segment is persistent.
trx_undo_assign_low(): Add template<bool is_temp>. Never invoke
trx_undo_reuse_cached() for temporary tables.
Tested by: Matthias Leich
Let us remove explicit updates of MONITOR_NUM_UNDO_SLOT_USED
and MONITOR_NUM_UNDO_SLOT_CACHED, and let us compute the rough values
from trx_sys.rseg_array[] on demand.
trx_purge_truncate_rseg_history(): If all other conditions for
invoking trx_purge_remove_log_hdr() hold, but the state is
TRX_UNDO_CACHED instead of TRX_UNDO_TO_PURGE, detach and free it.
Tested by: Matthias Leich
buf_LRU_get_free_block(): Always wake up the page cleaner if needed
before exiting the inner loop.
srv_prepare_to_delete_redo_log_file():
Replace a debug assertion with a wait in debug builds.
Starting with commit 7e31a8e7fa
the debug assertion ut_ad(!os_aio_pending_writes())
could occasionally fail, while it would hold in core dumps of crashes.
The failure can be reproduced more easily by adding a sleep to the
write completion callback function, right before releasing to
write_slots.
srv_start(): Remove a bogus debug assertion
ut_ad(!os_aio_pending_writes()) that could fail in
mariadb-backup --prepare. In an rr replay trace, we had
buf_pool.flush_list.count==0 but write_slots->m_cache.m_pos==1
and buf_page_t::write_complete() was executing u_unlock().
fp->field_length was unsigned, and therefore the condition that
tests it for a negative value is wrong.
A backport of cc182aca93 fixes it. However, to make the use of
types consistent, pcf->Length needs to be unsigned
too.
At one point pcf->Precision is assigned from pcf->Length, so
that is also unsigned.
GetTypeSize is assigned to length and has a length argument.
A -1 default value seemed dangerous to cast, so at least 0
should assert if ever hit.
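Why a -1 default is risky once the types are unsigned can be seen in a
minimal sketch. The helper name is hypothetical and not taken from the
patched code. It only demonstrates the standard conversion rule.

```cpp
#include <cstdint>

// Hypothetical helper: converts a signed default into an unsigned length,
// as happens implicitly when -1 is passed where an unsigned type is expected.
constexpr std::uint32_t as_unsigned_length(int value)
{
  return static_cast<std::uint32_t>(value);
}

// Conversion to an unsigned type is modular, so -1 becomes the largest
// representable value instead of an "invalid" marker that is below zero.
static_assert(as_unsigned_length(-1) == 4294967295u, "-1 wraps around");
static_assert(as_unsigned_length(0) == 0u, "0 stays 0");
```

A default of 0 avoids the wrap-around and can be caught by an assertion.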
buf_flush_wait_flushed(): Correct the logic for registering a wait
around buf_flush_wait() that
commit a091d6ac4e
recently broke. This should be easily repeatable when using a
non-default startup parameter:
thread-handling=pool-of-threads
trx_purge_free_segment(): The buffer-fix only prevents a block from
being freed completely from the buffer pool, but it will not prevent
the block from being evicted. Recheck the page identifier after
acquiring an exclusive page latch. If it has changed, backtrack and
invoke buf_page_get_gen() to look up the page normally.
Similar to 567b6812, continue replacing uses of strcat() and
strcpy() with the safer options strncat() and strncpy().
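For reference, a bounded copy and append might be wrapped as in the sketch
below. The helper names are hypothetical and are not part of this change.
Note that strncpy() does not NUL-terminate on truncation, and that the size
argument of strncat() is the remaining space, not the buffer size.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy src into dst (capacity dst_size), always
// leaving dst NUL-terminated.
void copy_bounded(char *dst, std::size_t dst_size, const char *src)
{
  if (dst_size == 0)
    return;
  std::strncpy(dst, src, dst_size - 1);
  dst[dst_size - 1] = '\0'; // strncpy() does not terminate on truncation
}

// Hypothetical helper: append src to the NUL-terminated string in dst
// without exceeding dst_size bytes in total.
void append_bounded(char *dst, std::size_t dst_size, const char *src)
{
  const std::size_t used = std::strlen(dst);
  if (used + 1 >= dst_size)
    return; // no room left
  // strncat() writes at most n characters plus a terminating NUL.
  std::strncat(dst, src, dst_size - used - 1);
}
```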
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the BSD-new
license. I am contributing on behalf of my employer Amazon Web Services
io_callback(): Process the request before releasing the write slot.
Before commit a091d6ac4e
when we had a duplicated counter for writes, either ordering was fine.
Now, correctness depends on os_aio_wait_until_no_pending_writes().
There is an assumption that when there are no completed tests,
they are still running, and an attempt is then made to
identify these tests as stalled.
The other possibility, however, is that no tests were run.
Test for this early and exit quickly, so that no later
misinterpretation can occur.
buf_flush_LRU_list_batch(): When evicting clean pages,
release and reacquire the buf_pool.mutex after every 32 pages.
Also, eliminate some conditional branches.
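The idea of periodically yielding the mutex can be sketched as follows. The
types and names are hypothetical stand-ins for buf_pool.mutex and the LRU
list. This is not the code of buf_flush_LRU_list_batch().

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-ins for the buffer pool mutex and its LRU list.
struct pool_sketch
{
  std::mutex mutex;
  std::vector<int> lru; // identifiers of clean pages
};

// Evicts all pages and returns how many times the mutex was released
// and reacquired along the way.
std::size_t evict_clean_pages(pool_sketch &pool)
{
  std::size_t evicted = 0, yields = 0;
  std::unique_lock<std::mutex> lock(pool.mutex);
  while (!pool.lru.empty())
  {
    pool.lru.pop_back();
    if (++evicted % 32 == 0)
    {
      lock.unlock(); // give waiting threads a chance to acquire the mutex
      lock.lock();
      ++yields;
    }
  }
  return yields;
}
```

Any state derived from the list before the unlock must be treated as stale
after the mutex is reacquired, which is why the loop re-reads the list on
every iteration.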
fil_space_t::create(), fil_space_t::add(): Expect the caller to
acquire and release fil_system.mutex. In this way, creating a tablespace
and adding the first (usually only) data file will be atomic.
recv_sys_t::recover_deferred(): Correctly protect some changes by
holding fil_system.mutex.
Tested by: Matthias Leich
do_shutdown_server(): After sending SIGKILL, invoke wait_until_dead().
Thanks to Sergei Golubchik for pointing out that the previous fix
does not actually work.
trx_assign_rseg_low(): Let us restore the debug variable look_for_rollover
to avoid assertion failures when a server that was created with
multiple undo tablespaces is being started with innodb_undo_tablespaces=0.
stored externally
row_merge_buf_add(): Has a strict assertion that a fixed-length mismatch
should not happen while rebuilding a table in the redundant row format.
btr_index_rec_validate(): A fixed-size column can be stored externally,
so the sum of the inline stored length and the externally stored length
of the column should be equal to the total column length.
This issue is caused by a race condition between DDL
and the fts optimize thread. DDL adds the new index to the fts cache.
At the same time, the fts optimize thread clears the cache
and reinitializes it. Take the cache init lock before reinitializing
the cache. fts_sync_commit() should take the dict_sys mutex
to avoid a deadlock with create index.
do_shutdown_server(): Call wait_until_dead() also when we are forcibly
killing the process (timeout=0). We have evidence that killing
the process may take some time and cause mystery failures in
crash recovery tests. For InnoDB, several failures were observed between
commit da094188f6 and
commit 0ee1082bd2
when no advisory file locking was being used by default.
Assertion `thd->mdl_context.is_lock_owner()` fires when a client is
disconnected while a transaction is active and a table is opened through
the `HANDLER` interface.
The reason for the assertion is that when a connection closes, its ongoing
transaction is eventually rolled back in
`Wsrep_client_state::bf_rollback()`. This method also releases explicit
locks, which are expected to survive beyond the transaction lifetime.
This patch also removes calls to `mysql_ull_cleanup()`. User level
locks are not supported in combination with Galera, making these calls
unnecessary.
trx_assign_rseg_low(): Simplify the debug check.
trx_rseg_t::reinit(): Reset the skip_allocation() flag.
This logic was broken in the merge
commit 3e2ad0e918
of commit 0de3be8cfd
(that is, innodb_undo_log_truncate=ON would never be "completed").
Tested by: Matthias Leich
A GROUP BY query which uses "MIN(pk)" and has "pk<>const" in the
WHERE clause would produce wrong result when handled with "Using index
for group-by". Here "pk" column is the table's primary key.
The problem was introduced by the fix for MDEV-23634. It made the range
optimizer not produce ranges for conditions of the form "pk != const".
However, the LooseScan code requires that the optimizer is able to
convert the condition on the MIN/MAX column into an equivalent range.
The range is used to locate the row that has the MIN/MAX value.
LooseScan checks this in check_group_min_max_predicates(). This fix
makes the code in that function take into account that "pk != const"
does not produce a range.
If SQL_MODE contains ANSI_QUOTES (https://mariadb.com/kb/en/sql-mode/), then
the double-quote character (") is not a legal string delimiter.
In 13e77930e6 (diff-a333d4ebb2d73b6361ef7dfebc86d883f7e19853b4a9eb85984b039058fae47cR2431-R2435),
Daniel Black introduced a case where the double-quote character would be used as
a string delimiter in the SQL queries generated by mariadb-tzinfo-to-sql.
This tool generates SQL queries which should be able to run on any
MariaDB server of the matching version. Therefore, it should be extremely
conservative in the SQL that it outputs, in order to maximize the chance
that it can run regardless of the build or execution environment of the
server.
See MDEV-18778, MDEV-28263, and MDEV-28782 for previous cases where MariaDB
has FAILED TO ENSURE that the generated timezone.sql actually works in
different build and execution environments. More test coverage is clearly
needed here.
All new code of the whole pull request, including one or several files that are
either new files or modified ones, are contributed under the BSD-new license. I
am contributing on behalf of my employer Amazon Web Services, Inc.
The 2013 error was right to catch case B of the test, which was
unprepared for an expected simulated crash.
The test is refined to SELECT a (type of) bool value before the
crash is invoked.
Appending to 'eatmydata' will obviously produce the name of an
executable that doesn't exist. Use an array to build the entire command.
Also, while we are at it, check that fakeroot actually works before
using it.
The glibc headers declare fallocate only if _GNU_SOURCE is defined.
Without this change, the probe fails with C compilers which do not
support implicit function declarations even if the system does in
fact support the fallocate function.
Upstream rocksdb does not need this because the probe is run with the
C++ compiler, and current g++ versions define _GNU_SOURCE
automatically.
fil_delete_tablespace() stores file handle in local variable and calls
mtr_t::commit_file()=>fil_system_t::detach(..., detach_handle=true), which
sets space->chain.start->handle = OS_FILE_CLOSED. fil_system_t::detach()
is invoked under fil_system.mutex.
But before the mutex is acquired, some parallel thread can change
space->chain.start->handle. fil_delete_tablespace() then returns the
value stored in the local variable, i.e. a wrong value.
The file handle can be closed, for example, from buf_flush_space() when the
limit of innodb_open_files is exceeded and fil_space_t::get() causes
a fil_space_t::try_to_close() call.
fil_space_t::try_to_close() is executed under fil_system.mutex. And
mtr_t::commit_file() locks it for fil_system_t::detach() call.
fil_system_t::detach() returns the detached file handle if its argument
detach_handle is true. The fix is to let mtr_t::commit_file() pass
that detached file handle to fil_delete_tablespace().
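The difference between caching the handle early and taking it from the
detach step can be shown with a reduced model. All types and names below are
hypothetical simplifications; integer handles stand in for pfs_os_file_t.

```cpp
#include <mutex>

constexpr int FILE_CLOSED_SKETCH = -1; // stands in for OS_FILE_CLOSED

struct node_sketch { int handle; };

// Hypothetical, reduced model of fil_system: both operations read and
// reset the handle while holding the same mutex.
struct fil_system_sketch
{
  std::mutex mutex;

  // Models a concurrent close triggered by the open file limit.
  void try_to_close(node_sketch &node)
  {
    std::lock_guard<std::mutex> guard(mutex);
    node.handle = FILE_CLOSED_SKETCH;
  }

  // Returns the handle as it is at detach time, under the mutex.
  int detach(node_sketch &node)
  {
    std::lock_guard<std::mutex> guard(mutex);
    const int detached = node.handle;
    node.handle = FILE_CLOSED_SKETCH;
    return detached;
  }
};
```

A caller that copied node.handle before another thread ran try_to_close()
would keep a stale value, whereas the return value of detach() reflects the
state that was observed under the mutex.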
Post-push fix.
The 10.5 fix for MDEV-30775 inserts a just-opened tablespace right after
the element that fil_system.space_list_last_opened points to.
In MDEV-25223, fil_system_t::space_list was changed from UT_LIST to
ilist. ilist<...>::insert(iterator pos, reference value) inserts the element
into the list before pos.
But this was not taken into account during the 10.5->10.6 merge in
85cbfaefee, and the fix
does not work properly, i.e. it inserted the just-opened tablespace at the
position preceding fil_system.space_list_last_opened.
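The insert-before semantics can be demonstrated with std::list, which
follows the same convention for insert(pos, value). This is only an analogy:
std::list stands in for the InnoDB ilist, and insert_after() is a
hypothetical helper.

```cpp
#include <iterator>
#include <list>

// Hypothetical helper: insert value after pos in a container whose
// insert() places the new element before the given iterator.
// Precondition: pos is dereferenceable (not end()).
std::list<int>::iterator insert_after(std::list<int> &list,
                                      std::list<int>::iterator pos,
                                      int value)
{
  return list.insert(std::next(pos), value);
}
```

Given the list {1, 2, 3} and pos pointing at 1, insert_after(list, pos, 9)
yields {1, 9, 2, 3}, while list.insert(pos, 9) would yield {9, 1, 2, 3}.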