Added a SESSION-only system variable "wsrep_dirty_reads" to allow SELECT
queries to pass even when the node is not prepared to accept queries
(wsrep_ready=OFF). Added a test case.
The problem was that the maximum value of the transaction_prealloc_size
session system variable was ULONG_MAX which meant that it was possible
to cause the server to allocate excessive amounts of memory.
This patch fixes the problem by reducing the maxmimum value of
transaction_prealloc_size and transaction_alloc_block_size down
to 128K.
Note that transactions will still be able to allocate more than
128K if needed, this patch just reduces the amount that can be
preallocated - as well as the maximum size of the incremental
allocation blocks.
Added a SESSION-only system variable "wsrep_dirty_reads" to allow SELECT
queries to pass even when the node is not prepared to accept queries
(wsrep_ready=OFF). Added a test case.
log-tc-size is 24K by default. Page size is 64K on PPC64. But log-tc-size
must be at least 3 x page size. This is enforced by TC_LOG_MMAP::open()
with a comment: to guarantee non-empty pool.
This all makes server not startable in default configuration on PPC64.
Autosize log-tc-size, so that it's min value= page size * 3, default
value= page size * 6, block size= page size.
Implement --semi-sync-master-wait-point=AFTER_SYNC|AFTER_COMMIT.
When AFTER_SYNC, the semi-sync wait will be done earlier, before the storage
engine commit rather than after. This means that a transaction will not be
visible on the master until at least one slave has received it.
Implement a new mode for parallel replication. In this mode, all transactions
are optimistically attempted applied in parallel. In case of conflicts, the
offending transaction is rolled back and retried later non-parallel.
This is an early-release patch to facilitate testing, more changes to user
interface / options will be expected. The new mode is not enabled by default.
- Changed 0x%lx -> %p
array.c:
- Static (preallocated) buffer can now be anywhere
my_sys.h
- Define MY_INIT_BUFFER_USED
sql_delete.cc & sql_lex.cc
- Use memroot when allocating classes (avoids call to current_thd)
sql_explain.h:
- Use preallocated buffers
sql_explain.cc:
- Use preallocated buffers and memroot
sql_select.cc:
- Use multi_alloc_root() instead of many alloc_root()
- Update calls to Explain
so that a simple query with one join would not have to call my_malloc.
- Allow lower limites for query_prealloc_size for testing.
- Fixed wrong initialization of trans_alloc_block_size
innodb_stats_sample_pages
Analysis: If you set the number of analyzed pages
to very low number compared to actual pages on
that table/index it randomly pics those pages
(default 8 pages), this leads to fact that query
after analyze table returns different results. If
the index tree is small, smaller than 10 *
n_sample_pages + total_external_size, then the
estimate is ok. For bigger index trees it is
common that we do not see any borders between
key values in the few pages we pick. But still
there may be n_sample_pages different key values,
or even more. And it just tries to
approximate to n_sample_pages (8).
Fix: (1) Introduced new dynamic configuration variable
innodb_stats_sample_traditional that retains
the current design. Default false.
(2) If traditional sample is not used we use
n_sample_pages = max(min(srv_stats_sample_pages,
index->stat_index_size),
log2(index->stat_index_size)*
srv_stats_sample_pages);
(3) Introduced new dynamic configuration variable
stat_modified_counter (default = 0) if set
sets lower bound for row updates when statistics is re-estimated.
If user has provided upper bound for how many rows needs to be updated
before we calculate new statistics we use minimum of provided value
and 1/16 of table every 16th round. If no upper bound is provided
(srv_stats_modified_counter = 0, default) then calculate new statistics
if 1 / 16 of table has been modified
since the last time a statistics batch was run.
We calculate statistics at most every 16th round, since we may have
a counter table which is very small and updated very often.
@param t table
@return true if the table has changed too much and stats need to be
recalculated
*/
#define DICT_TABLE_CHANGED_TOO_MUCH(t) \
((ib_int64_t) (t)->stat_modified_counter > (srv_stats_modified_counter ? \
ut_min(srv_stats_modified_counter, (16 + (t)->stat_n_rows / 16)) : \
16 + (t)->stat_n_rows / 16))
use the same restriction for character_set_client on the command line
and from SQL.
Also: remove strange hack from thd_init_client_charset() that contradicted
the manual (collation_connection and character_set_result were not always set)
Merge Facebook commit cd063ab930
authored by Peng Tian from https://github.com/facebook/mysql-5.6
Introduced a new configuration variable innodb_fatal_semaphore_wait_threshold,
it makes the fatal semaphore timeout configurable. Modified original commit
so that no MariaDB server files are changed, instead introduced a new
InnoDB/XtraDB configuration variable.
Its default/min/max vlaues are 600/1/2^32-1 in seconds (it was hardcoded
as 600, now its default value is 600, so the default behavior of this diff
should be no change).
setting of innodb_io_capacity_max
(a) Changed the behaviour so that if you set innodb_io_capacity to a
value > innodb_io_capacity_max that the value is accepted AND
that innodb_io_capacity_max = innodb_io_capacity * 2.
(b) If someone wants to reduce innodb_io_capacity_max and
reduce it below innodb_io_capacity then innodb_io_capacity
should be reduced to the same level as innodb_io_capacity_max.
In both cases give a warning to user.
Merged Facebook commit 617aef9f911d825e9053f3d611d0389e02031225
authored by Inaam Rana to InnoDB storage engine (not XtraDB)
from https://github.com/facebook/mysql-5.6
WL#7047 - Optimize buffer pool list scans and related batch processing
Reduce excessive scanning of pages when doing flush list batches. The
fix is to introduce the concept of "Hazard Pointer", this reduces the
time complexity of the scan from O(n*n) to O.
The concept of hazard pointer is reversed in this work. Academically
hazard pointer is a pointer that the thread working on it will declar
such and as long as that thread is not done no other thread is allowe
do anything with it.
In this WL we declare the pointer as a hazard pointer and then if any
thread attempts to work on it, it is allowed to do so but it has to a
the hazard pointer to the next valid value. We use hazard pointer sol
reverse traversal of lists within a buffer pool instance.
Add an event to control the background flush thread. The background f
thread wait has been converted to an os event timed wait so that it c
signalled by threads that want to kick start a background flush when
buffer pool is running low on free/dirty pages.
Merge Facebook commit 154c579b828a60722a7d9477fc61868c07453d08
and e8f0052f9b112dc786bf9b957ed5b16a5749f7fd authored
by Steaphan Greene from https://github.com/facebook/mysql-5.6
Optimize prefix index queries to skip cluster index lookup when possible.
Currently InnoDB will always fetch the clustered index (primary key
index) for all prefix columns in an index, even when the value of a
particular record is smaller than the prefix length. This change
optimizes that case to use the record from the secondary index and avoid
the extra lookup.
Also adds two status vars that track how effective this is:
innodb_secondary_index_triggered_cluster_reads:
Times secondary index lookup triggered cluster lookup.
innodb_secondary_index_triggered_cluster_reads_avoided:
Times prefix optimization avoided triggering cluster lookup.
Merge Facebook commit 4f3e0343fd2ac3fc7311d0ec9739a8f668274f0d
authored by Steaphan Greene from https://github.com/facebook/mysql-5.6
Adds innodb_idle_flush_pct to enable tuning of the page flushing rate
when the system is relatively idle. We care about this, since doing
extra unnecessary flash writes shortens the lifespan of the flash.
Merge Facebook commit ca40b4417fd224a68de6636b58c92f133703fc68
authored by Steaphan Greene from https://github.com/facebook/mysql-5.6
Change the default value for innodb_log_compressed_pages to false
Logging these pages is a waste. We don't want this to be enabled.
One caution here: If the zlib version used by innodb is changed, but
the running version is still the previous version, and the running
version crashes, it is possible crash recovery could fail.
When crash recovery uses a zlib version at all different than the
version used by the crashed instance, it is possible that a redone
compression could fail, where the original did not, because the new
zlib version compresses the same data to a slightly larger size.
Because of the nature of compression, this is even possible when
upgrading to a version of zlib which actually peforms overall better
compression than the previous version.
If this happens, mysql will fail to recover, since a page split can
not be safely triggered during crash recovery.
So, either the exact zlib version must be controlled between builds,
or these rare recovery failures must be accepted. The cost of
logging these pages is quite high, so we consider this limitation to
be worthwhile.
This failure scenario can not happen if there was a clean shutdown.
This is only relevant to restarting crashed instances, or starting an
instance built via a hot backup too (XtraBackup).
New generation hard drives, SSDs and NVM devices support 4K
sector size. Supported sector size can be found using fstatvfs()
or GetDiskFreeSpace() functions.
Merged Facebook commit dd2d11be7aaf3be270e740fb95cbc4eacb52f4d7
authored by Rongrong Zhong from https://github.com/facebook/mysql-5.6
This fixes MySQL Bug #68220 innodb_rows_updated is misleading on slave
http://bugs.mysql.com/bug.php?id=68220
Added innodb_system_rows_read/inserted/updated/deleted counters
that are the equivalent of innodb_rows_* but that only account for
changes made to system databases (mysql, information_schame and
preformance_schema). These counters will be used on slaves to
differentiated the updates made on system databases from those made on
user databases.
innodb_rows_* status counters are not updated when innodb_system_rows_*
are updated.
dd2d11be7a
Merged Facebook commit ecff018632c6db49bad73d9233c3cdc9f41430e9
authored by Steaphan Greene from https://github.com/facebook/mysql-5.6
This change is to fix: http://bugs.mysql.com/62534
This makes innodb_max_dirty_pages_pct a double with min,default,max values
0.001, 75, 99.999.
This also makes innodb_max_dirty_pages_pct_lwm and adaptive_flushing_lwm
doubles, as these sysvars are inter-dependent.
Added more to the BUFFER POOL AND MEMORY section of SHOW INNODB STATUS:
Percent pages dirty: X.X
This is all n_dirty_pages / used_pages
Percent all pages dirty: X.X
This is all n_dirty_pages / all-pages
Max dirty pages percent: X.X
This is innodb_max_dirty_pages_pct
Also changed all of buf from 2 to 3 digits of precision (%.2f -> %.3f).
Extend existing plugins to support
* SHOW QUERY_RESPONSE_TIME
* FLUSH QUERY_RESPONSE_TIME
* SHOW LOCALE
move userstat tables to use the new API instead of
hand-coded syntax