buf_page_print(): Remove the parameter 'flags',
and when a server abort is intended, perform that in the caller.
In this way, page corruption reports due to different reasons
can be distinguished better.
This is non-functional code refactoring that does not fix any
page corruption issues. The change is only made to avoid falsely
grouping together unrelated causes of page corruption.
When a slow shutdown is performed soon after spawning some work for
background threads that can create or commit transactions, it is possible
that new transactions are started or committed after the purge has finished.
This is violating the specification of innodb_fast_shutdown=0, namely that
the purge must be completed. (None of the history of the recent transactions
would be purged.)
Also, it is possible that the purge threads would exit in slow shutdown
while there exist active transactions, such as recovered incomplete
transactions that are being rolled back. Thus, the slow shutdown could
fail to purge some undo log that becomes purgeable after the transaction
commit or rollback.
srv_undo_sources: A flag that indicates if undo log can be generated
or the persistent, whether by background threads or by user SQL.
Even when this flag is clear, active transactions that already exist
in the system may be committed or rolled back.
innodb_shutdown(): Renamed from innobase_shutdown_for_mysql().
Do not return an error code; the operation never fails.
Clear the srv_undo_sources flag, and also ensure that the background
DROP TABLE queue is empty.
srv_purge_should_exit(): Do not allow the purge to exit if
srv_undo_sources are active or the background DROP TABLE queue is not
empty, or in slow shutdown, if any active transactions exist
(and are being rolled back).
srv_purge_coordinator_thread(): Remove some previous workarounds
for this bug.
innobase_start_or_create_for_mysql(): Set buf_page_cleaner_is_active
and srv_dict_stats_thread_active directly. Set srv_undo_sources before
starting the purge subsystem, to prevent immediate shutdown of the purge.
Create dict_stats_thread and fts_optimize_thread immediately
after setting srv_undo_sources, so that shutdown can use this flag to
determine if these subsystems were started.
dict_stats_shutdown(): Shut down dict_stats_thread. Backported from 10.2.
srv_shutdown_table_bg_threads(): Remove (unused).
This merge reverts commit 6ca4f693c1ce472e2b1bf7392607c2d1124b4293
from current 5.6.36 innodb.
Bug #23481444 OPTIMISER CALL ROW_SEARCH_MVCC() AND READ THE
INDEX APPLIED BY UNCOMMITTED ROW
Problem:
========
row_search_for_mysql() does whole table traversal for range query
even though the end range is passed. Whole table traversal happens
when the record is not with in transaction read view.
Solution:
=========
Convert the innodb last record of page to mysql format and compare
with end range if the traversal of row_search_mvcc() exceeds 100,
no ICP involved. If it is out of range then InnoDB can avoid the
whole table traversal. Need to refactor the code little bit to
make it compile.
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
Reviewed-by: Knut Hatlen <knut.hatlen@oracle.com>
Reviewed-by: Dmitry Shulga <dmitry.shulga@oracle.com>
RB: 14660
The macro UT_LIST_INIT() zero-initializes the UT_LIST_NODE.
There is no need to call this macro on a buffer that has
already been zero-initialized by mem_zalloc() or mem_heap_zalloc()
or similar.
For some reason, the statement UT_LIST_INIT(srv_sys->tasks) in
srv_init() caused a SIGSEGV on server startup when compiling with
GCC 7.1.0 for AMD64 using -O3. The zero-initialization was attempted
by the instruction movaps %xmm0,0x50(%rax), while the proper offset
of srv_sys->tasks would seem to have been 0x48.
There is a race condition related to the variable
srv_stats.n_lock_wait_current_count, which is only
incremented and decremented by the function lock_wait_suspend_thread(),
The incrementing is protected by lock_sys->wait_mutex, but the
decrementing does not appear to be protected by anything.
This mismatch could allow the counter to be corrupted when a
transactional InnoDB table or record lock wait is terminating
roughly at the same time with the start of a wait on a
(possibly different) lock.
ib_counter_t: Remove some unused methods. Prevent instantiation for N=1.
Add an inc() method that takes a slot index as a parameter.
single_indexer_t: Remove.
simple_counter<typename Type, bool atomic=false>: A new counter wrapper.
Optionally use atomic memory operations for modifying the counter.
Aligned to the cache line size.
lsn_ctr_1_t, ulint_ctr_1_t, int64_ctr_1_t: Define as simple_counter<Type>.
These counters are either only incremented (and we do not care about
losing some increment operations), or the increment/decrement operations
are protected by some mutex.
srv_stats_t::os_log_pending_writes: Document that the number is protected
by log_sys->mutex.
srv_stats_t::n_lock_wait_current_count: Use simple_counter<ulint, true>,
that is, atomic inc() and dec() operations.
lock_wait_suspend_thread(): Release the mutexes before incrementing
the counters. Avoid acquiring the lock mutex if the lock wait has
already been resolved. Atomically increment and decrement
srv_stats.n_lock_wait_current_count.
row_insert_for_mysql(), row_update_for_mysql(),
row_update_cascade_for_mysql(): Use the inc() method with the trx->id
as the slot index. This is a non-functional change, just using
inc() instead of add(1).
buf_LRU_get_free_block(): Replace the method add(index, n) with inc().
There is no slot index in the simple_counter.
Allow 64-bit atomic operations on 32-bit systems,
only relying on HAVE_ATOMIC_BUILTINS_64, disregarding
the width of the register file.
Define UNIV_WORD_SIZE correctly on all systems, including Windows.
In MariaDB 10.0 and 10.1, it was incorrectly defined as 4 on
64-bit Windows.
Define HAVE_ATOMIC_BUILTINS_64 on Windows
(64-bit atomics are available on both 32-bit and 64-bit Windows
platforms; the operations were unnecessarily disabled even on
64-bit Windows).
MONITOR_OS_PENDING_READS, MONITOR_OS_PENDING_WRITES: Enable by default.
os_file_n_pending_preads, os_file_n_pending_pwrites,
os_n_pending_reads, os_n_pending_writes: Remove.
Use the monitor counters instead.
os_file_count_mutex: Remove. On a system that does not support
64-bit atomics, monitor_mutex will be used instead.
The function trx_purge_stop() was calling os_event_reset(purge_sys->event)
before calling rw_lock_x_lock(&purge_sys->latch). The os_event_set()
call in srv_purge_coordinator_suspend() is protected by that X-latch.
It would seem a good idea to consistently protect both os_event_set()
and os_event_reset() calls with a common mutex or rw-lock in those
cases where os_event_set() and os_event_reset() are used
like condition variables, tied to changes of shared state.
For each os_event_t, we try to document the mutex or rw-lock that is
being used. For some events, frequent calls to os_event_set() seem to
try to avoid hangs. Some events are never waited for infinitely, only
timed waits, and os_event_set() is used for early termination of these
waits.
os_aio_simulated_put_read_threads_to_sleep(): Define as a null macro
on other systems than Windows. TODO: remove this altogether and disable
innodb_use_native_aio on Windows.
os_aio_segment_wait_events[]: Initialize only if innodb_use_native_aio=0.
we can see from the hang stacktrace, srv_monitor_thread is blocked
when getting log_sys::mutex, so that sync_arr_wake_threads_if_sema_free
cannot get a change to break the mutex deadlock.
The fix is simply removing any mutex wait in srv_monitor_thread.
Patch is reviewed by Sunny over IM.
Backport pull request #125 from grooverdan/MDEV-8923_innodb_buffer_pool_dump_pct to 10.0
WL#6504 InnoDB buffer pool dump/load enchantments
This patch consists of two parts:
1. Dump only the hottest N% of the buffer pool(s)
2. Prevent hogging the server duing BP load
From MySQL - commit b409342c43ce2edb68807100a77001367c7e6b8e
Add testcases for innodb_buffer_pool_dump_pct_basic.
Part of the code authored by Daniel Black
TO USE A SECOND WATCH PAGE PER INSTANCE
Description:
BUF_POOL_WATCH_SIZE is also initialized to number of purge threads.
so BUF_POOL_WATCH_SIZE will never be lesser than number of purge threads.
From the code, there is no scope for purge thread to skip buf_pool_watch_unset.
So there can be at most one buffer pool watch active per purge thread.
In other words, there is no chance for purge thread instance to hold a watch
when setting another watch.
Solution:
Adding code comments to clarify the issue.
Reviewed-by: Marko Mäkelä <marko.makela@oracle.com>
Approved via Bug page.
This patch ports the work that facebook has performed
to make innochecksum handle compressed tables.
the basic idea is to use actual innodb-code to perform
checksum verification rather than duplicating in innochecksum.cc.
to make this work, innodb code has been annotated with
lots of #ifndef UNIV_INNOCHECKSUM so that it can be
compiled outside of storage/innobase.
A new testcase is also added that verifies that innochecksum
works on compressed/non-compressed tables.
Merged from commit fabc79d2ea976c4ff5b79bfe913e6bc03ef69d42
from https://code.google.com/p/google-mysql/
The actual steps to produce this patch are:
take innochecksum from 5.6.14
apply changes in innodb from facebook patches needed to make innochecksum compile
apply changes in innochecksum from facebook patches
add handcrafted testcase
The referenced facebook patches used are:
91e25120e7847fe76ea51135628a5a4dbf7c240c
Analysis: InnoDB error monitor is responsible to call every second
sync_arr_wake_threads_if_sema_free() to wake up possible hanging
threads if they are missed in mutex_signal_object. This is not
possible if error monitor itself is on mutex/semaphore wait. We
should avoid all unnecessary mutex/semaphore waits on error monitor.
Currently error monitor calls function buf_flush_stat_update()
that calls log_get_lsn() function and there we will try to get
log_sys mutex. Better, solution for error monitor is that in
buf_flush_stat_update() we will try to get lsn with
mutex_enter_nowait() and if we did not get mutex do not update
the stats.
Fix: Use log_get_lsn_nowait() function on buf_flush_stat_update()
function. If returned lsn is 0, we do not update flush stats.
log_get_lsn_nowait() will use mutex_enter_nowait() and if
we get mutex we return a correct lsn if not we return 0.
Analysis: InnoDB error monitor is responsible to call every second
sync_arr_wake_threads_if_sema_free() to wake up possible hanging
threads if they are missed in mutex_signal_object. This is not
possible if error monitor itself is on mutex/semaphore wait. We
should avoid all unnecessary mutex/semaphore waits on error monitor.
Currently error monitor calls function buf_flush_stat_update()
that calls log_get_lsn() function and there we will try to get
log_sys mutex. Better, solution for error monitor is that in
buf_flush_stat_update() we will try to get lsn with
mutex_enter_nowait() and if we did not get mutex do not update
the stats.
Fix: Use log_get_lsn_nowait() function on buf_flush_stat_update()
function. If returned lsn is 0, we do not update flush stats.
log_get_lsn_nowait() will use mutex_enter_nowait() and if
we get mutex we return a correct lsn if not we return 0.
Merged Facebook commit ecff018632c6db49bad73d9233c3cdc9f41430e9
authored by Steaphan Greene from https://github.com/facebook/mysql-5.6
This change is to fix: http://bugs.mysql.com/62534
This makes innodb_max_dirty_pages_pct a double with min,default,max values
0.001, 75, 99.999.
This also makes innodb_max_dirty_pages_pct_lwm and adaptive_flushing_lwm
doubles, as these sysvars are inter-dependent.
Added more to the BUFFER POOL AND MEMORY section of SHOW INNODB STATUS:
Percent pages dirty: X.X
This is all n_dirty_pages / used_pages
Percent all pages dirty: X.X
This is all n_dirty_pages / all-pages
Max dirty pages percent: X.X
This is innodb_max_dirty_pages_pct
Also changed all of buf from 2 to 3 digits of precision (%.2f -> %.3f).
compressed pages
buf_flush_LRU() returns the number of pages processed. There are
two types of processing that can happen. A page can get evicted or
a page can get flushed. These two numbers are quite distinct and
should not be mixed.
Merge Facebook commit 926a077b14b73c14094de7fc7aa913241b801b4d
authored by Inaam Rana from https://github.com/facebook/mysql-5.6.
This is fix for upstream bugs
http://bugs.mysql.com/bug.php?id=71988http://bugs.mysql.com/bug.php?id=70500
page_cleaner should work whether or not there is server activity.
Its iterations become a noop when there is no work to do but we
should not tie it to the server activity.
The page_cleaner thread does spurious background flushing
because of conditional sleep between iterations. The solution
is not to make sleep dependent on server activity etc.
Merged Facebooks commit 6e06bbfa315ffb97d713dd6e672d6054036ddc21
authored by Inaam Rana from https://github.com/facebook/mysql-5.6.
Fixes MySQL bug http://bugs.mysql.com/bug.php?id=72123
lock_timeout thread works in a tight loop waking up every second
and checking for lock_wait_timeout. In addition, when a mysql
thread is forced to wait on a lock, it signals the lock_timeout thread
as well. This call is not required. In a heavily contended workload
each thread going to wait will signal the lock_timeout thread making
it work all the time. As lock_timeout thread scans the array of
waiting threads under lock_sys::wait_mutex which is already very
hot in contneded loads, these extra scans can cause significanct
performance regression.
Also, in various codepaths lock_timeout thread is signalled where
actual intention was to signal the innodb monitor thread.
IN RECOVERY
During redo log processing, the data dictionary is not available. We should
check it in dict_find_table_by_space() to prevent SEGV error.
rb#5678, approved by Jimmy.
IN RECOVERY
During redo log processing, the data dictionary is not available. We should
check it in dict_find_table_by_space() to prevent SEGV error.
rb#5678, approved by Jimmy.