This is for Oracle compatibility. In Oracle, ENABLED is the default
and merely ensures that NOT NULL constraints are checked, which is
also the default in MariaDB.
A side effect of MDEV-16264 is that a large number of threads will
be created at server startup, to be destroyed after a minute or two.
One source of such thread creation is srv_start_periodic_timer().
InnoDB is creating 3 periodic tasks: srv_master_callback (1Hz),
srv_error_monitor_task (1Hz), and srv_monitor_task (0.2Hz).
It appears that we can merge srv_error_monitor_task and srv_monitor_task
and have them invoked 4 times per minute (every 15 seconds). This will
affect our ability to enforce innodb_fatal_semaphore_wait_threshold and
some computations around BUF_LRU_STAT_N_INTERVAL.
We could remove srv_master_callback along with the DROP TABLE queue
at some point in the future. We must keep it independent
of the innodb_fatal_semaphore_wait_threshold detection, because
the background DROP TABLE queue could get stuck due to dict_sys
being locked by another thread. For now, srv_master_callback
must be invoked once per second, so that
innodb_flush_log_at_timeout=1 can work.
BUF_LRU_STAT_N_INTERVAL: Reduce the precision and extend the time
from 50*1 second to 4*15 seconds.
srv_error_monitor_timer: Remove.
MAX_MUTEX_NOWAIT: Increase from 20*1 second to 2*15 seconds.
srv_refresh_innodb_monitor_stats(): Avoid a repeated call to time(NULL).
Change the interval to less than 60 seconds.
srv_monitor(): Renamed from srv_monitor_task.
srv_monitor_task(): Renamed from srv_error_monitor_task().
Invoked only once in 15 seconds. Invoke also srv_monitor().
Increase the fatal_cnt threshold from 10*1 second to 1*15 seconds.
sync_array_print_long_waits_low(): Invoke time(NULL) only once.
Remove a bogus message about printouts for 30 seconds. Those
printouts were effectively already disabled in MDEV-16264
(commit 5e62b6a5e0).
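
The interval changes listed above can be checked with a little arithmetic.
The following is only a sketch: window_seconds() is a hypothetical helper,
not an InnoDB function, and the tick counts are the ones quoted in this
message.

```cpp
#include <cassert>

// Hypothetical helper, not an InnoDB function: the wall-clock span covered
// by `ticks` invocations of a task that runs every `period_s` seconds.
constexpr unsigned window_seconds(unsigned ticks, unsigned period_s)
{
  return ticks * period_s;
}

// BUF_LRU_STAT_N_INTERVAL: 50*1 second before, 4*15 seconds after.
static_assert(window_seconds(50, 1) == 50, "old LRU statistics window");
static_assert(window_seconds(4, 15) == 60, "new LRU statistics window");

// MAX_MUTEX_NOWAIT: 20*1 second before, 2*15 seconds after.
static_assert(window_seconds(20, 1) == 20, "old mutex nowait window");
static_assert(window_seconds(2, 15) == 30, "new mutex nowait window");

// fatal_cnt threshold: 10*1 second before, 1*15 seconds after.
static_assert(window_seconds(10, 1) == 10, "old fatal_cnt window");
static_assert(window_seconds(1, 15) == 15, "new fatal_cnt window");
```

In each case the covered time span grows while the number of samples
shrinks, which is the loss of precision mentioned above.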
The purpose of the InnoDB page cleaner subsystem is to write out
modified pages from the buffer pool to data files. When the
innodb_max_dirty_pages_pct_lwm is not exceeded or
innodb_adaptive_flushing=ON decides not to write out anything,
the page cleaner should keep sleeping indefinitely until the state
of the system changes: a dirty page is added to the buffer pool such
that the page cleaner would no longer be idle.
buf_flush_page_cleaner(): Explicitly note when the page cleaner is idle.
When that happens, use mysql_cond_wait() instead of mysql_cond_timedwait().
buf_flush_insert_into_flush_list(): Wake up the page cleaner if needed.
innodb_max_dirty_pages_pct_update(),
innodb_max_dirty_pages_pct_lwm_update():
Wake up the page cleaner just in case.
Note: buf_flush_ahead(), buf_flush_wait_flushed() and shutdown are
already waking up the page cleaner thread.
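
The idle-wait idea can be sketched outside InnoDB with standard C++
primitives. Everything below is an assumption made for illustration:
PageCleanerSketch and its members are hypothetical names, a plain page
counter stands in for the flush list, and std::condition_variable stands
in for mysql_cond_wait().

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the page cleaner; none of this is InnoDB code.
class PageCleanerSketch
{
  std::mutex m;
  std::condition_variable do_flush; // signalled when there may be work
  std::condition_variable done;     // signalled after each flush batch
  std::size_t dirty = 0;            // modified pages not yet written
  std::size_t flushed = 0;          // pages written so far
  std::size_t lwm;                  // stay idle while dirty <= lwm
  bool idle = false;                // set while the worker sleeps untimed
  bool stop = false;
  std::thread worker;               // declared last: starts after the rest

  void run()
  {
    std::unique_lock<std::mutex> lk(m);
    while (!stop) {
      if (dirty <= lwm) {
        // Nothing to do: sleep with no timeout until someone wakes us.
        idle = true;
        do_flush.wait(lk, [this] { return stop || dirty > lwm; });
        idle = false;
        continue;
      }
      flushed += dirty; // pretend to write out every dirty page
      dirty = 0;
      done.notify_all();
    }
  }

public:
  explicit PageCleanerSketch(std::size_t lwm_)
    : lwm(lwm_), worker(&PageCleanerSketch::run, this) {}

  ~PageCleanerSketch()
  {
    { std::lock_guard<std::mutex> lk(m); stop = true; }
    do_flush.notify_all();
    worker.join();
  }

  // Analogue of adding a block to the flush list: wake up an idle worker
  // only if the new page pushes us past the low-water mark.
  void add_dirty_page()
  {
    std::lock_guard<std::mutex> lk(m);
    ++dirty;
    if (idle && dirty > lwm) do_flush.notify_one();
  }

  // Analogue of a configuration update: wake the worker just in case.
  void set_lwm(std::size_t new_lwm)
  {
    std::lock_guard<std::mutex> lk(m);
    lwm = new_lwm;
    do_flush.notify_one();
  }

  // Block until at least n pages have been written; return the total.
  std::size_t wait_flushed(std::size_t n)
  {
    std::unique_lock<std::mutex> lk(m);
    done.wait(lk, [&] { return flushed >= n; });
    return flushed;
  }
};
```

With a low-water mark of 2, the worker in this sketch stays asleep for the
first two add_dirty_page() calls and is woken by the third. Lowering the
mark with set_lwm() wakes it even though no page was added.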
tpool::aio::N_PENDING: Replaces OS_AIO_N_PENDING_IOS_PER_THREAD.
This limits two similar things: the number of outstanding requests
that a thread may io_submit(), and the number of completed requests
collected at a time by io_getevents().
In the asynchronous I/O interface, InnoDB is invoking io_getevents()
with a timeout value of half a second, and requesting exactly 1 event
at a time.
The reason to have such a short timeout is to facilitate shutdown.
We can do better: Use an infinite timeout, wait for a larger maximum
number of events. On shutdown, we will invoke io_destroy(), which
should lead to the io_getevents system call reporting EINVAL.
my_getevents(): Reimplement the libaio io_getevents() by only invoking
the system call. The library implementation would try to elide the
system call and return 0 immediately if aio_ring_is_empty() holds.
Here, we do want a blocking system call, not 100% CPU usage. Nor
do we want aio_ring_is_empty() to trigger SIGSEGV by
dereferencing memory that was freed by io_destroy().
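
A wrapper that only invokes the system call might look roughly like the
following Linux-only sketch. The name getevents_sketch() is hypothetical
and the error convention (a negative errno value, as libaio reports it) is
an assumption; this is not the actual my_getevents().

```cpp
#include <cassert>
#include <cerrno>
#include <ctime>
#include <linux/aio_abi.h>
#include <sys/syscall.h>
#include <unistd.h>

// Hypothetical wrapper (Linux only). It always enters the kernel and never
// inspects the user-space completion ring, so a null timeout blocks until
// at least min_nr events are available or the context is destroyed.
// Failures are reported as a negative errno value; errno is left unchanged.
static int getevents_sketch(aio_context_t ctx, long min_nr, long nr,
                            io_event *events, timespec *timeout)
{
  const int saved_errno = errno;
  long ret = syscall(__NR_io_getevents, ctx, min_nr, nr, events, timeout);
  if (ret < 0) {
    ret = -errno;
    errno = saved_errno;
  }
  return static_cast<int>(ret);
}
```

A completion thread built on such a wrapper could pass a null timeout and
treat -EINVAL as the signal that io_destroy() was called during shutdown.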
The greedy fetch_add(1) approach of read_trylock() may cause
starvation of a waiting write lock request. Let us use a
compare-and-swap for the read lock acquisition in order to
guarantee the progress of writers.
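
The difference can be illustrated with a minimal lock word. In a greedy
scheme every reader first increments the word and only then checks for a
writer, so a steady stream of readers keeps modifying it. In the
compare-and-swap scheme sketched below, a reader never modifies the word
once a writer is pending. The class name, the flag layout and the member
functions are assumptions chosen for this example, not the actual InnoDB
rw_lock.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical lock word: the low bits count readers, the two high bits
// are writer flags.
class RwLockWordSketch
{
  static constexpr uint32_t WRITER = 1U << 31;          // held by a writer
  static constexpr uint32_t WRITER_WAITING = 1U << 30;  // a writer waits
  static constexpr uint32_t WRITER_PENDING = WRITER | WRITER_WAITING;
  std::atomic<uint32_t> word{0};

public:
  // Acquire a shared lock unless a writer holds or waits for the lock.
  // The word is modified only if no writer flag was observed.
  bool read_trylock()
  {
    uint32_t l = word.load(std::memory_order_relaxed);
    do {
      if (l & WRITER_PENDING) return false;
    } while (!word.compare_exchange_weak(l, l + 1,
                                         std::memory_order_acquire,
                                         std::memory_order_relaxed));
    return true;
  }

  void read_unlock() { word.fetch_sub(1, std::memory_order_release); }

  // Announce a waiting writer; from now on read_trylock() will fail.
  void write_lock_wait_start()
  {
    word.fetch_or(WRITER_WAITING, std::memory_order_relaxed);
  }

  // Acquire the exclusive lock once no reader or writer holds it.
  bool write_trylock()
  {
    uint32_t l = word.load(std::memory_order_relaxed);
    do {
      if (l & ~WRITER_WAITING) return false; // readers or a writer present
    } while (!word.compare_exchange_weak(l, WRITER,
                                         std::memory_order_acquire,
                                         std::memory_order_relaxed));
    return true;
  }

  void write_unlock() { word.fetch_and(~WRITER, std::memory_order_release); }
};
```

Once write_lock_wait_start() has been called, the reader count can only
decrease, so the waiting writer will eventually succeed.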
We always defined PFS_SKIP_BUFFER_MUTEX_RWLOCK, that is,
the latches of the buffer pool blocks were never instrumented
in PERFORMANCE_SCHEMA.
For some reason, the debug_latch (which enforces proper usage of
buffer-fixing in debug builds) was instrumented.
In commit bf3c862faa we introduced
an assertion that may dereference a null pointer.
This regression was caught by running the following:
./mtr --parallel=auto --suite=innodb \
--mysqld=--loose-innodb-adaptive-hash-index
The adaptive hash index is disabled by default since
commit 88cdfc5c7d (MDEV-20487)
and hence the problem was not caught earlier.
The test seems to deterministically fail on RelWithDebInfo builds
due to a timeout in wait_condition.inc.
According to Matthias Leich (the original author of the test),
the failure rate would reduce if we disabled the purge of
transaction history by setting innodb_force_recovery=2.
For now, let us run this stress test on debug builds only.
fil_space_t::flush_low(): Define and declare without inline.
ut_is_2pow(): Remove UNIV_LIKELY. This is almost exclusively
used in debug assertions. UNIV_LIKELY is not compatible with
static_assert in some compilers.
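
A power-of-two test without a branch hint is a plain constant expression,
which is what static_assert needs. The sketch below uses a hypothetical
function name. It assumes that the incompatibility comes from the branch
hint expanding to a compiler builtin that some compilers do not accept in
constant expressions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical name; the same bit trick without any branch hint.
// Note that 0 also passes this test.
constexpr bool is_2pow_sketch(std::uint64_t n)
{
  return (n & (n - 1)) == 0;
}

// Usable at compile time as well as in run-time debug assertions.
static_assert(is_2pow_sketch(4096), "4096 is a power of 2");
static_assert(!is_2pow_sketch(4097), "4097 is not a power of 2");
```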
pipeline in community BB
Fix for rebuild from source step
Disable MCS on i386|i686 platforms
This patch puts the MCS Debian packaging files and part of debian/control
into the engine directory.
When MDEV-19544 (commit 1a6f470464)
simplified the initialization of the local variable
set_also_gap_locks, an inadvertent change was included.
Essentially, all code branches that are executed when
set_also_gap_locks holds must also ensure that
trx->isolation_level > TRX_ISO_READ_COMMITTED holds.
This was being violated in a few code paths.
It turns out that there is an even simpler fix: Remove the test
of thd_is_select() completely. In that way, the first part of
UPDATE or DELETE should work exactly like SELECT...FOR UPDATE.
thd_is_select(): Remove.
Starting with commit 7cffb5f6e8 (MDEV-23399)
the function buf_flush_page() will first acquire block->lock and only
after that invoke set_io_fix(). Before that, it was possible to reach
a livelock between buf_page_create() and buf_flush_page().
buf_page_create(): Directly try acquiring the exclusive page latch
without checking whether the page is io-fixed or buffer-fixed.
(As a matter of fact, the have_x_latch() check is not strictly necessary,
because we still support recursive X-latches.)
In case of a latch conflict, wait while allowing buf_page_write_complete()
to acquire buf_pool.mutex and release the block->lock.
An attempt to wait for exclusive block->lock while holding buf_pool.mutex
would lead to a hang in the tests parts.part_supported_sql_func_innodb
and stress.ddl_innodb, due to a deadlock between buf_page_write_complete()
and buf_page_create().
Similarly, in case of an I/O fixed compressed-only
ROW_FORMAT=COMPRESSED page, we will sleep before retrying.
In both cases, we will sleep for 1ms or until a flush batch is completed.
The data member tv_usec of struct timeval is declared as suseconds_t
on macOS, and suseconds_t is 4 bytes wide. On the other hand, ulong is
8 bytes wide on 64-bit macOS, so an attempt to assign a value of the wider
type (usec) to a member of the narrower type (tv_usec) leads to an error.
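
One way to avoid such an error is an explicit conversion to the type of the
member. This is a sketch under assumptions: make_timeval_sketch() is a
hypothetical helper, and the actual fix may use a different cast.

```cpp
#include <cassert>
#include <sys/time.h>

// Hypothetical helper: build a timeval from wide unsigned values by casting
// to whatever types the platform declares for tv_sec and tv_usec.
// The caller must ensure that usec is less than 1000000.
inline timeval make_timeval_sketch(unsigned long sec, unsigned long usec)
{
  timeval tv;
  tv.tv_sec = static_cast<decltype(tv.tv_sec)>(sec);
  tv.tv_usec = static_cast<decltype(tv.tv_usec)>(usec);
  return tv;
}
```

Using decltype keeps the cast correct on platforms where suseconds_t is
4 bytes wide as well as on those where it is 8 bytes wide.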
This allows MariaDB to compile on old Linux kernel versions
(limited to >2.6.32).
This warns that attempts to use large pages will rely on
implicit kernel determination.
This reverts commit 6cf8f05fd9.
The original patch assumed that MAP_HUGETLB was consistent across
architectures, which isn't the case. Defining it unconditionally
broke large pages on every architecture where the value differed
from x86_64.
With the EOL for CentOS/RHEL6 announced in 10.5.7, Linux kernels
<3.8 are no longer supported.
Add a new privilege "SLAVE MONITOR" which grants a user the permission
to execute the "SHOW SLAVE STATUS" and "SHOW RELAYLOG EVENTS" commands.
SHOW SLAVE STATUS requires either the SLAVE MONITOR or the SUPER privilege.
SHOW RELAYLOG EVENTS requires the SLAVE MONITOR privilege.