Analysis: InnoDB error monitor is responsible to call every second
sync_arr_wake_threads_if_sema_free() to wake up possible hanging
threads if they are missed in mutex_signal_object. This is not
possible if error monitor itself is on mutex/semaphore wait. We
should avoid all unnecessary mutex/semaphore waits on error monitor.
Currently error monitor calls function buf_flush_stat_update()
that calls log_get_lsn() function and there we will try to get
log_sys mutex. Better, solution for error monitor is that in
buf_flush_stat_update() we will try to get lsn with
mutex_enter_nowait() and if we did not get mutex do not update
the stats.
Fix: Use log_get_lsn_nowait() function on buf_flush_stat_update()
function. If returned lsn is 0, we do not update flush stats.
log_get_lsn_nowait() will use mutex_enter_nowait() and if
we get mutex we return a correct lsn if not we return 0.
Merged Facebook commit 617aef9f911d825e9053f3d611d0389e02031225
authored by Inaam Rana to InnoDB storage engine (not XtraDB)
from https://github.com/facebook/mysql-5.6
WL#7047 - Optimize buffer pool list scans and related batch processing
Reduce excessive scanning of pages when doing flush list batches. The
fix is to introduce the concept of "Hazard Pointer", this reduces the
time complexity of the scan from O(n*n) to O.
The concept of hazard pointer is reversed in this work. Academically
hazard pointer is a pointer that the thread working on it will declar
such and as long as that thread is not done no other thread is allowe
do anything with it.
In this WL we declare the pointer as a hazard pointer and then if any
thread attempts to work on it, it is allowed to do so but it has to a
the hazard pointer to the next valid value. We use hazard pointer sol
reverse traversal of lists within a buffer pool instance.
Add an event to control the background flush thread. The background f
thread wait has been converted to an os event timed wait so that it c
signalled by threads that want to kick start a background flush when
buffer pool is running low on free/dirty pages.
Merge Facebook commit 4f3e0343fd2ac3fc7311d0ec9739a8f668274f0d
authored by Steaphan Greene from https://github.com/facebook/mysql-5.6
Adds innodb_idle_flush_pct to enable tuning of the page flushing rate
when the system is relatively idle. We care about this, since doing
extra unnecessary flash writes shortens the lifespan of the flash.
Merged Facebook commit ecff018632c6db49bad73d9233c3cdc9f41430e9
authored by Steaphan Greene from https://github.com/facebook/mysql-5.6
This change is to fix: http://bugs.mysql.com/62534
This makes innodb_max_dirty_pages_pct a double with min,default,max values
0.001, 75, 99.999.
This also makes innodb_max_dirty_pages_pct_lwm and adaptive_flushing_lwm
doubles, as these sysvars are inter-dependent.
Added more to the BUFFER POOL AND MEMORY section of SHOW INNODB STATUS:
Percent pages dirty: X.X
This is all n_dirty_pages / used_pages
Percent all pages dirty: X.X
This is all n_dirty_pages / all-pages
Max dirty pages percent: X.X
This is innodb_max_dirty_pages_pct
Also changed all of buf from 2 to 3 digits of precision (%.2f -> %.3f).
Merge Facebook commit 926a077b14b73c14094de7fc7aa913241b801b4d
authored by Inaam Rana from https://github.com/facebook/mysql-5.6.
This is fix for upstream bugs
http://bugs.mysql.com/bug.php?id=71988http://bugs.mysql.com/bug.php?id=70500
page_cleaner should work whether or not there is server activity.
Its iterations become a noop when there is no work to do but we
should not tie it to the server activity.
The page_cleaner thread does spurious background flushing
because of conditional sleep between iterations. The solution
is not to make sleep dependent on server activity etc.
Merged Facebooks commit 6e06bbfa315ffb97d713dd6e672d6054036ddc21
authored by Inaam Rana from https://github.com/facebook/mysql-5.6.
Fixes MySQL bug http://bugs.mysql.com/bug.php?id=72123
lock_timeout thread works in a tight loop waking up every second
and checking for lock_wait_timeout. In addition, when a mysql
thread is forced to wait on a lock, it signals the lock_timeout thread
as well. This call is not required. In a heavily contended workload
each thread going to wait will signal the lock_timeout thread making
it work all the time. As lock_timeout thread scans the array of
waiting threads under lock_sys::wait_mutex which is already very
hot in contneded loads, these extra scans can cause significanct
performance regression.
Also, in various codepaths lock_timeout thread is signalled where
actual intention was to signal the innodb monitor thread.
buf0buf.cc line 2642.
Analysis: innodb_compression_algorithm is a global variable and
can change while we are building page compressed page. This could
lead page corruption.
Fix: Cache innodb_compression_algorithm on local variable before
doing any compression or page formating to avoid concurrent
change. Improved page verification on debug builds.
InnoDB: Failing assertion: mutex_own(&buf_pool->LRU_list_mutex)
and
InnoDB: Assertion failure in thread 2868898624 in file buf0lru.c line 1077
InnoDB: Failing assertion: mutex_own(&buf_pool->LRU_list_mutex)
Analysis: Function buf_LRU_free_block might release LRU_list_mutex on
same cases to avoid mutex order problems, we need to take it back
before accessing list.
Analysis: InnoDB writes also files that do not contain FIL-header.
This could lead incorrect analysis on os_fil_read_func function
when it tries to see is page page compressed based on FIL_PAGE_TYPE
field on FIL-header. With bad luck uncompressed page that does
not contain FIL-headed, the byte on FIL_PAGE_TYPE position could
indicate that page is page comrpessed.
Fix: Upper layer must indicate is file space page compressed
or not. If this is not yet known, we need to read the FIL-header
and find it out. Files that we know that are not page compressed
we can always just provide FALSE.
config.h.cmake:
define NOMINMAX, otherwise Windows system headers define min() and max() macros
sql/slave.cc:
mi->report() has one more argument in MariaDB
storage/xtradb/buf/buf0flu.cc:
xtradb fixes for windows, again
buf0flu.cc line 549.
Analysis: If buf_page_get_state(bpage) == BUF_BLOCK_REMOVE_HASH then
buf_page_in_file(bpage) might not be true.
Fix: ut_a(buf_page_in_file(bpage) || buf_page_get_state(bpage) == BUF_BLOCK_REMOVE_HASH);
4229: MDEV-5670: Assertion failure in file buf0lru.c line 2355
Add more status information if repeatable.
4230: MDEV-5673: Crash while parallel dropping multiple tables under heavy load
Improve long semaphore wait output to include all semaphore waits
and try to find out if there is a sequence of waiters.
4233: Fix compiler errors on product build.
4237: Fix too agressive long semaphore wait output and add guard against introducing
compression failures on insert buffer.
4238: Fix test failure caused by simulated compression failure on
IBUF_DUMMY table.
in file buf0mtflu.cc line 570.
Analysis: Real timing bug, we should take the mutex before we
try to send those shutdown messages, that would make sure
that threads doing a unfinished flush (they have acquired
this mutex) have time to do their work before we add shutdown
messages to work queue. Currently, we just add those shutdown
messages to work queue and code assumes that at flush, there
is constant number of items to be processed and thus
leading to assertion.
Analysis: Based on crashed the buffer pool instance identifier is
not correct on block to be freed. Add LRU list mutex holding
on functions calling free and add additional safety checks.
(Inadequate background LRU flushing for write workloads with InnoDB compression).
If InnoDB compression is used and the workload has writes, the
following situation is possible. The LRU flusher issues an LRU flush
request for an instance. buf_do_LRU_batch decides to perform
unzip_LRU eviction and this eviction might fully satisfy the
request. Then buf_flush_LRU_tail checks the number of flushed pages in
the last iteration, finds it to be zero, and wrongly decides not to
flush that instance anymore.
Fixed by maintaining unzip_LRU eviction counter in struct
flush_counter_t variables, and checking it in buf_flush_LRU_tail when
deciding whether to stop flushing the current instance.
Added test cases for new configuration files to get mysql-test-run suite sys_vars
to pass. Fix some small errors.
special builds UNIV_PAGECOMPRESS_DEBUG and UNIV_MTFLUSH_DEBUG).
Added a new status variable compress_pages_page_compression_error to count possible
compression errors.
when compressed tables are used.
Analysis: Number of flushed pages is incorrectly calculated at
buf_do_LRU_batch. This leads to problem when utility function
flushes dirty blocks from the end of the flush list of
all buffer pool instances in a loop until enough pages are flushed
or time limit is reached. As number of flushed pages is incorrectly
calculated, the loop mostly try to flush until time limit is
reached because the number of pages limit is not reached.
Fix: Fix the calculation of flushed pages (very short). This fix
was provided by Alexey Stroganov (Percona).
Analysis: XtraDB merge regression, at the end of mutex_spin_wait before goto mutex_loop
there is missing
if (prio_mutex) {
os_atomic_decrement_ulint(&prio_mutex->high_priority_waiters, 1);
}
Hence we get unbalanced waiter count.
Thanks to Laurynas Biveinis for finding this.
- Table locks now ends with state "After table lock"
- Open table now ends with state "After opening tables"
- All calls to close_thread_tables(), not only from mysql_execute_command(), has state "closing tables"
- Added state "executing" for mysql admin commands, like CACHE INDEX, REPAIR TABLE etc.
- Added state "Finding key cache" for CACHE INDEX
- Added state "Filling schema table" when we generate temporary table for SHOW commands and information schema.
Other things:
Add limit from innobase for thread_sleep_delay. This fixed a failing tests case.
Added db.opt to support-files to make 'make package' work
mysql-test/suite/funcs_1/datadict/processlist_val.inc:
Use new state
mysql-test/suite/funcs_1/r/processlist_priv_no_prot.result:
Updated test result because of new state
mysql-test/suite/funcs_1/r/processlist_val_no_prot.result:
Updated test result because of new state
sql/CMakeLists.txt:
Have option files in support-files
sql/lock.cc:
Added new state 'After table lock'
sql/sql_admin.cc:
Added state "executing" and "Sending data" for mysql admin commands, like CACHE INDEX, REPAIR TABLE etc.
Added state "Finding key cache"
sql/sql_base.cc:
open tables now ends with state "After table lock", instead of NULL
sql/sql_parse.cc:
Moved state "closing tables" to close_thread_tables()
sql/sql_show.cc:
Added state "Filling schema table" when we generate temporary table for SHOW commands and information schema.
storage/xtradb/buf/buf0buf.c:
Removed compiler warning
storage/xtradb/handler/ha_innodb.cc:
Add limit from innobase for thread_sleep_delay. This fixed a failing tests case.
support-files/db.opt:
cmakes needs this to create data/test directory
Added option to disable multi-threaded flush with innodb_use_mtflush = 0
option, by default multi-threaded flush is used.
Updated innochecksum tool, still it does not support new checksums.
execution and global memory heap on shutdown.
Added a funcition to get work items from queue without waiting and
additional info when there is no work to do for a extended periods.
waiting for a job. They should sleep and let other threads to their work. At
shutdown, we know that we put "work" and that is handled as soon as possible.