There is several misindentation inside Debian post and pre
installation scripts. False indentation with space
as indent space should be 2 and indentation with tabs.
This bug could cause a crash of the server when processing a query with
ROWNUM() if it used in its FROM list a reference to a mergeable view
defined as SELECT over more than one table that contained ORDER BY clause.
When a mergeable view with ORDER BY clause and without LIMIT clause is used
in the FROM list of a query that does not have ORDER BY clause the ORDER BY
clause of the view is moved to the query. The code that performed this
transformation forgot to delete the moved ORDER BY list from the view.
If a query contains ROWNUM() and uses a mergeable multi-table view with
ORDER BY then according to the current code of TABLE_LIST::init_derived()
the view has to be forcibly materialized. As the query and the view shared
the same items in its ORDER BY lists they could not be properly resolved
either in the query or in the view. This led to a crash of the server.
This patch has returned back the original signature of LEX::can_not_use_merged()
to comply with 10.4 code of the condition that checks whether a megeable
view has to be forcibly materialized.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
To make sure that PMEM problem does not happen again sync
10.9 and up debian/rules to make sure that they are not making
anymore problems with autobake-debs.sh
When processing a query over a mergeable view at some conditions checked
at prepare stage it may be decided to materialize the view rather than
to merge it. Before this patch in such case the field 'derived' of the
TABLE_LIST structure created for the view remained set to 0. As a result
the guard condition preventing range analysis for materialized views did
not work properly. This led to a call of some handler method for the
temporary table created to contain the view's records that was supposed
to be used only for opened tables. However temporary tables created for
materialization of derived tables or views are not opened yet when range
analysis is performed.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
Introduce @@optimizer_switch flag: hash_join_cardinality
When it is on, use EITS statistics to produce tighter bounds for
hash join output cardinality.
Amended by Monty.
Reviewed by: Monty <monty@mariadb.org>
Now the same rule applied to vews and derived tables. So we should
allow merge of views (and derived) in queries with rownum, because
it do not change results, only makes query plans better.
The InnoDB buffer pool and locking were heavily refactored in
MariaDB Server 10.6. Among other things, dict_sys.mutex was removed,
and the contended lock_sys.mutex was replaced with a combination of
lock_sys.latch and distributed latches in hash tables. Also, a
default value was changed to innodb_flush_method=O_DIRECT to improve
performance in write-heavy workloads.
One thing where an adjustment was missing is around the parameters
innodb_max_purge_lag (number of committed transactions waiting to
be purged), and innodb_max_purge_lag_delay
(maximum number of microseconds to delay a DML operation).
purge_coordinator_state::do_purge(): Pass the history_size to trx_purge()
and reset srv_dml_needed_delay if the history is empty.
Keep executing the loop non-stop as long as srv_dml_needed_delay is set.
trx_purge_dml_delay(): Made part of trx_purge().
Set srv_dml_needed_delay=0 when nothing can be purged (!n_pages_handled).
row_mysql_delay_if_needed(): Mimic the logic of
innodb_max_purge_lag_wait_update().
Reviewed by: Thirunarayanan Balathandayuthapani
The old code added to 10.6 was inconsisting in how TCP/IP and
socket connection was chosen. One got also a confusing warning
in some cases.
Examples:
> ../client/mysql --print-defaults
../client/mysql would have been started with the following arguments:
--socket=/tmp/mariadbd.sock --port=3307 --no-auto-rehash
> ../client/mysql
ERROR 2002 (HY000): Can't connect to local server through socket '/tmp/mariadbd.sock' (2)
> ../client/mysql --print-defaults
../client/mysql would have been started with the following arguments:
--socket=/tmp/mariadbd.sock --port=3307 --no-auto-rehash
> ../client/mysql --port=3333
WARNING: Forcing protocol to TCP due to option specification. Please explicitly state intended protocol.
ERROR 2002 (HY000): Can't connect to server on 'localhost' (111)
> ../client/mysql --port=3333 --socket=sss
ERROR 2002 (HY000): Can't connect to local server through socket 'sss' (2)
> ../client/mysql --socket=sss --port=3333
ERROR 2002 (HY000): Can't connect to local server through socket 'sss' (2)
Some notable things:
- One gets a warning if one uses just --port if config file sets socket
- Using port and socket gives no warning
- Using socket and then port still uses socket
This patch changes things the following ways:
If --port= is given on the command line, the the protocol is automatically
changed to "TCP/IP".
- If --socket= is given on the command line, the protocol is automatically
changed to "socket".
- The last option wins
- No warning is given if protocol changes automatically.
AWK in used in Debian SysV-init and postinst scripts to determine
is there enough space starting MariaDB database or create new
database to target destination.
These AWK scripts can be rewrited to use pure SH or help
using Coreutils which is mandatory for usage of MariaDB currently.
Reasoning behind this is to get rid of one very less used dependency
log_free_check(): Assert that the caller must not hold
exclusive lock_sys.latch. This was the case for calls from
ibuf_delete_for_discarded_space(). This caused a deadlock with
another thread that would be holding a latch on a dirty page
that would need to be written so that the checkpoint would advance
and log_free_check() could return. That other thread was waiting
for a shared lock_sys.latch.
fil_delete_tablespace(): Do not invoke ibuf_delete_for_discarded_space()
because in DDL operations, we will be holding exclusive lock_sys.latch.
trx_t::commit(std::vector<pfs_os_file_t>&), innodb_drop_database(),
row_purge_remove_clust_if_poss_low(), row_undo_ins_remove_clust_rec(),
row_discard_tablespace_for_mysql():
Invoke ibuf_delete_for_discarded_space() on the deleted tablespaces after
releasing all latches.
page_cleaner_flush_pages_recommendation(): If dirty_pct is
between innodb_max_dirty_pages_pct_lwm
and innodb_max_dirty_pages_pct,
scale the effort relative to how close we are to
innodb_max_dirty_pages_pct.
The previous formula was missing a multiplication by 100.
Tested by: Axel Schwenke
This addition to MDEV-30804 is relevant for 10.6+, it excludes
the mixed transaction section using both innodb and aria storage
engines from the galera_var_replicate_aria_off test, since such
transactions cannot be executed unless aria supports two-phase
transaction commit. No additional tests are required as this
commit fixes the mtr test itself.
The error was seen by a number of mtr tests being caused
by overdue initialization of rpl_parallel::LOCK_parallel_entry.
Specifically, SHOW-SLAVE-STATUS might find in
rpl_parallel::workers_idle() a gtid domain hash entry
already inserted whose mutex had not done
mysql_mutex_init().
Fixed with swapping the mutex init and the its entry's stack insertion.
Tested with a generous number of `mtr --repeat` of a few of the reported
to fail tests, incl rpl.parallel_backup.
buf_pool_t::page_cleaner_wakeup(): If for_LRU=true, wake up the page
cleaner immediately, also when it is in a timed wait. This avoids an
unnecessary delay of up to 1 second.
When commit a5a2ef079c
implemented asynchronous doublewrite, the writes via
the doublewrite buffer started to be counted incorrectly,
without multiplying them by innodb_page_size.
srv_export_innodb_status(): Correctly count the
Innodb_data_written.
buf_dblwr_t: Remove submitted(), because it is close to written()
and only Innodb_data_written was interested in it. According to
its name, it should count completed and not submitted writes.
Tested by: Axel Schwenke
If a replica failed to update the GTID slave state when committing
an XA PREPARE, the replica would retry the transaction and get an
out-of-order GTID error. This is because the commit phase of an XA
PREPARE is bifurcated. That is, first, the prepare is handled by the
relevant storage engines. Then second, the GTID slave state is
updated as a separate autocommit transaction. If the second phase
fails, and the transaction is retried, then the same transaction is
attempted to be committed again, resulting in a GTID out-of-order
error.
This patch fixes this error by immediately stopping the slave and
reporting the appropriate error. That is, there was logic to bypass
the error when updating the GTID slave state table if the underlying
error is allowed for retry on a parallel slave. This patch adds a
parameter to disallow the error bypass, thereby forcing the error
state to still happen.
Reviewed By
============
Andrei Elkin <andrei.elkin@mariadb.com>
When replicating MDL events for a table that uses system versioning
without primary keys, ensure that for data sets with duplicate
records, the updates to these records with duplicates are enacted on
the correct row. That is, there was a bug (reported in MDEV-30430)
such that the function to find the row to update would stop after
finding the first matching record. However, in the absence of
primary keys, the version of the record is needed to compare the row
to ensure we are updating the correct one.
The fix, therefore, updates the record comparison functionality to
use system version columns when there are no primary keys on the
table.
Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
Problem:
========
A master can segfault if it can't set up decryption for its binary
log during a binlog dump with Using_Gtid=Slave_Pos. If slave
connects using GTID mode, the master will call into
log.cc::get_gtid_list_event(), which iterate through binlog events
looking for a Gtid_list_log_event. On an encrypted binlog that the
master cannot decrypt, the first event will be a
START_ENCRYPTION_EVENT which will call into the following decryption branch
if (fdle->start_decryption((Start_encryption_log_event*) ev))
errormsg= ‘Could not set up decryption for binlog.’;
The event iteration however, does not stop in spite of this error.
The master will try to read the next event, but segfault while
trying to decrypt it because decryption failed to initialize.
Solution:
========
Break the event iteration if decryption cannot be set up.
Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
This bug could manifest itself at the first execution of prepared statement
created for queries using a materialized view defined as union. A crash
could happen for sure if the query contained a condition pushable into
the view and this condition was over the column defined via a complex string
expression requiring implicit conversion from one charset to another for
some of its sub-expressions. The bug could cause crashes when executing
PS for some other queries whose optimization needed building clones for
such expressions.
This bug was introduced in the patch for MDEV-29988 where the class
Item_direct_ref_to_item was added. The implementations of the virtual
methods get_copy() and build_clone() were invalid for the class and this
could cause crashes after the method build_clone() was called for
expressions containing objects of the Item_direct_ref_to_item type.
Approved by Sergei Golubchik <serg@mariadb.com>
os_aio_wait_until_no_pending_reads(), os_aio_wait_until_pending_writes():
Add a Boolean parameter to indicate whether the wait should be declared
in the thread pool.
buf_flush_wait(): The callers have already declared a wait, so let us
avoid doing that again, just call os_aio_wait_until_pending_writes(false).
buf_flush_wait_flushed(): Do not declare a wait in the rare case that
the buf_flush_page_cleaner thread has been shut down already.
buf_flush_page_cleaner(), buf_flush_buffer_pool(): In the code that runs
during shutdown, do not declare waits.
buf_flush_buffer_pool(): Remove a debug assertion that might fail.
What really matters here is buf_pool.flush_list.count==0.
buf_read_recv_pages(), srv_prepare_to_delete_redo_log_file():
Do not declare waits during InnoDB startup.
Fixing buildbot failures on mariabackup.aria_log_dir_path_rel.
The problem was that directory_exists() was called with the
relative aria_log_dir_path value, while the current directory
in mariadb-backup is not necessarily equal to datadir when MTR is running.
Fix:
- Moving building the absolute path un level upper:
from the function copy_back_aria_logs() to the function copy_back().
- Passing the built absolute path to both directory_exists() and
copy_back_aria_logs() as a parameter.
- This patch does the following:
git revert --no-commit 673243c893
git revert --no-commit 6c669b9586
git revert --no-commit bacaf2d4f4
git checkout HEAD mysql-test
git revert --no-commit 1fd7d3a9ad
Above command reverts MDEV-29277, MDEV-25581, MDEV-29342.
When binlog is enabled, trasaction takes a lot of time to do
sync operation on innodb fts table. This leads to block
of other transaction commit. To avoid this failure, remove
the fulltext sync operation during transaction commit. So
reverted MDEV-25581 related patches.
We filed MDEV-31105 to avoid the memory consumption
problem during fulltext sync operation.
This bug caused server crash when processing a multi-update statement that
used views if optimizer tracing was enabled.
The bug was introduced in the patch for MDEV-30539 that could incorrectly
detect the most top level selects of queries if views were used in them.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
- `mariadb-backup --backup` was fixed to fetch the value of the
@@aria_log_dir_path server variable and copy aria_log* files
from @@aria_log_dir_path directory to the backup directory.
Absolute and relative (to --datadir) paths are supported.
Before this change aria_log* files were copied to the backup
only if they were in the default location in @@datadir.
- `mariadb-backup --copy-back` now understands a new my.cnf and command line
parameter --aria-log-dir-path.
`mariadb-backup --copy-back` in the main loop in copy_back()
(when copying back from the backup directory to --datadir)
was fixed to ignore all aria_log* files.
A new function copy_back_aria_logs() was added.
It consists of a separate loop copying back aria_log* files from
the backup directory to the directory specified in --aria-log-dir-path.
Absolute and relative (to --datadir) paths are supported.
If --aria-log-dir-path is not specified,
aria_log* files are copied to --datadir by default.
- The function is_absolute_path() was fixed to understand MTR style
paths on Windows with forward slashes, e.g.
--aria-log-dir-path=D:/Buildbot/amd64-windows/build/mysql-test/var/...
The motivation of this change is to allow undo pages for temporary tables
to be marked free as often as possible, so that we can avoid buf_pool.LRU
eviction (and writes) of undo pages that contain data that is
no longer needed. For temporary tables, no MVCC or purge of history
is needed, and reusing cached undo log pages might not help that much.
It is possible that this may cause some performance regression due to
more frequent allocation and freeing of undo log pages, but I only
measured a performance improvement.
trx_write_serialisation_history(): Never cache temporary undo log pages.
trx_undo_reuse_cached(): Assert that the rollback segment is persistent.
trx_undo_assign_low(): Add template<bool is_temp>. Never invoke
trx_undo_reuse_cached() for temporary tables.
Tested by: Matthias Leich
Let us remove explicit updates of MONITOR_NUM_UNDO_SLOT_USED
and MONITOR_NUM_UNDO_SLOT_CACHED, and let us compute the rough values
from trx_sys.rseg_array[] on demand.
trx_purge_truncate_rseg_history(): If all other conditions for
invoking trx_purge_remove_log_hdr() hold, but the state is
TRX_UNDO_CACHED instead of TRX_UNDO_TO_PURGE, detach and free it.
Tested by: Matthias Leich
buf_LRU_get_free_block(): Always wake up the page cleaner if needed
before exiting the inner loop.
srv_prepare_to_delete_redo_log_file():
Replace a debug assertion with a wait in debug builds.
Starting with commit 7e31a8e7fa
the debug assertion ut_ad(!os_aio_pending_writes())
could occasionally fail, while it would hold in core dumps of crashes.
The failure can be reproduced more easily by adding a sleep to the
write completion callback function, right before releasing to
write_slots.
srv_start(): Remove a bogus debug assertion
ut_ad(!os_aio_pending_writes()) that could fail in
mariadb-backup --prepare. In an rr replay trace, we had
buf_pool.flush_list.count==0 but write_slots->m_cache.m_pos==1
and buf_page_t::write_complete() was executing u_unlock().
fp->field_length was unsigned and therefore the negative
condition around it.
Backport of cc182aca93 fixes it, however to correct the
consistent use of types pcf->Length needs to be unsigned
too.
At one point pcf->Precision is assigned from pcf->Length so
that's also unsigned.
GetTypeSize is assigned to length and has a length argument.
A -1 default value seemed dangerious to case, so at least 0
should assert if every hit.
buf_flush_wait_flushed(): Correct the logic for registering a wait
around buf_flush_wait() that
commit a091d6ac4e
recently broke. This should be easily repeatable when using a
non-default startup parameter:
thread-handling=pool-of-threads
trx_purge_free_segment(): The buffer-fix only prevents a block from
being freed completely from the buffer pool, but it will not prevent
the block from being evicted. Recheck the page identifier after
acquiring an exclusive page latch. If it has changed, backtrack and
invoke buf_page_get_gen() to look up the page normally.
Similar to 567b6812 continue to replace use of strcat() and
strcpy() with safer options strncat() and strncpy().
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the BSD-new
license. I am contributing on behalf of my employer Amazon Web Services
io_callback(): Process the request before releasing the write slot.
Before commit a091d6ac4e
when we had a duplicated counter for writes, either ordering was fine.
Now, correctness depends on os_aio_wait_until_no_pending_writes().
There is an assumption that when there are are no completed tests,
that means they are still running and then an attempt is made to
identify these tests as stalled.
The other possibility is however there are no tests that where run.
Test this early and then exit quickly and no later misunderstandings
need to be made.
buf_flush_LRU_list_batch(): When evicting clean pages,
release and reacquire the buf_pool.mutex after every 32 pages.
Also, eliminate some conditional branches.
fil_space_t::create(), fil_space_t::add(): Expect the caller to
acquire and release fil_system.mutex. In this way, creating a tablespace
and adding the first (usually only) data file will be atomic.
recv_sys_t::recover_deferred(): Correctly protect some changes by
holding fil_system.mutex.
Tested by: Matthias Leich