Commit graph

3562 commits

Author SHA1 Message Date
Jan Lindström
0fe39b6856 MDEV-7262: innodb.innodb-mdev7046 and innodb-page_compression* fail on BuildBot
If persistent trim is not used, some operating systems require that we write the full page.
2014-12-04 12:40:19 +02:00
Jan Lindström
24a6b41348 Move page initialization to better place. 2014-12-03 13:53:11 +02:00
Sergei Golubchik
ec4137c62b Merge branch '10.1' into bb-10.1-merge 2014-12-03 11:37:26 +01:00
Jan Lindström
d4aef382fd Fix compiler failure on fallocate function and used flags. 2014-12-03 10:41:52 +02:00
Sergei Golubchik
853077ad7e Merge branch '10.0' into bb-10.1-merge
Conflicts:
	.bzrignore
	VERSION
	cmake/plugin.cmake
	debian/dist/Debian/control
	debian/dist/Ubuntu/control
	mysql-test/r/join_outer.result
	mysql-test/r/join_outer_jcl6.result
	mysql-test/r/null.result
	mysql-test/r/old-mode.result
	mysql-test/r/union.result
	mysql-test/t/join_outer.test
	mysql-test/t/null.test
	mysql-test/t/old-mode.test
	mysql-test/t/union.test
	packaging/rpm-oel/mysql.spec.in
	scripts/mysql_config.sh
	sql/ha_ndbcluster.cc
	sql/ha_ndbcluster_binlog.cc
	sql/ha_ndbcluster_cond.cc
	sql/item_cmpfunc.h
	sql/lock.cc
	sql/sql_select.cc
	sql/sql_show.cc
	sql/sql_update.cc
	sql/sql_yacc.yy
	storage/innobase/buf/buf0flu.cc
	storage/innobase/fil/fil0fil.cc
	storage/innobase/include/srv0srv.h
	storage/innobase/lock/lock0lock.cc
	storage/tokudb/CMakeLists.txt
	storage/xtradb/buf/buf0flu.cc
	storage/xtradb/fil/fil0fil.cc
	storage/xtradb/include/srv0srv.h
	storage/xtradb/lock/lock0lock.cc
	support-files/mysql.spec.sh
2014-12-02 22:25:16 +01:00
Jan Lindström
01590005ba Fix buildbot valgrind errors on test innodb.innodb-page_compression_tables
Problem was that temporary buffers allocated for page compression
were not initialized, and the rest of the page that is actually written
was also not initialized after previous usage.
2014-12-02 19:25:58 +02:00
Sergey Vojtovich
ed313e8a92 MDEV-7148 - Recurring: InnoDB: Failing assertion: !lock->recursive
On PPC64 high-loaded server may crash due to assertion failure in InnoDB
rwlocks code.

This happened because load order between "recursive" and "writer_thread"
wasn't properly enforced.
2014-12-01 14:58:29 +04:00
Jan Lindström
89a3628b0b Better comments part 2 with proof and simplified implementation.
Thanks to Daniel Black.
2014-11-25 12:04:32 +02:00
Jan Lindström
e3ded84b83 Fix typo. 2014-11-25 08:22:10 +02:00
Jan Lindström
e5802c38f9 Better comments and add a test case. 2014-11-25 08:06:41 +02:00
Jan Lindström
afe6d88d78 MDEV-7167: innodb.innodb_bug12902967 fails in buildbot on Windows
Problem is that there is additional error message from function
that is not really needed now.
2014-11-24 21:29:12 +02:00
Jan Lindström
1ac12df0cb MDEV-7164: innodb.innodb-alter-table-disk-full fails in buildbot on Windows
Analysis: Test case uses Linux specific error codes.

Fix: The test case currently cannot run on Windows because it requires
injecting an error into the system.
2014-11-24 15:23:13 +02:00
Jan Lindström
1a05bb4010 MDEV-7166: innodb.innodb-page_compression_zip fails in buildbot
Analysis: If innodb_use_trim is not enabled or system does not
support fallocate to make persistent trim, we should always
write full page not only partial pages.
2014-11-24 12:00:42 +02:00
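The rule stated in MDEV-7166 (and again in MDEV-7262) can be sketched as a small length calculation. This is a minimal illustration, not InnoDB's code: the function name `page_write_length` and its parameters are hypothetical, and `trim_supported` stands in for "the system supports fallocate for persistent trim".

```c
#include <stddef.h>

/* Hypothetical helper: decide how many bytes of a compressed page to write.
   Only when trim is both enabled and supported may the write be shortened
   to the compressed length; otherwise the full page is written. */
size_t page_write_length(size_t page_size, size_t compressed_len,
                         int use_trim, int trim_supported)
{
    if (use_trim && trim_supported && compressed_len < page_size) {
        return compressed_len;   /* partial write, remainder is trimmed */
    }
    return page_size;            /* always write the full page */
}
```

With a 16384-byte page compressed to 4096 bytes, the sketch returns 4096 only when both flags are set.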
Jan Lindström
e15a83c0c7 Better comments part 2 with proof and simplified implementation.
Thanks to Daniel Black.
2014-11-26 16:41:28 +02:00
Jan Lindström
f3bdf9d741 MDEV-7046: MySQL#74480 - Failing assertion: os_file_status(newpath, &exists, &type)
after Operating system error number 36 in a file operation.

Analysis: os_file_get_status did not handle error ENAMETOOLONG
correctly.

Fix: Add correct handling for error ENAMETOOLONG. Note that in the InnoDB
case the error is not passed all the way up to the server. That would
be a bigger revamp.
2014-11-25 11:38:01 +02:00
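For MDEV-7046, the idea of treating ENAMETOOLONG as a distinct, non-fatal outcome of a status check can be sketched with plain POSIX `stat()`. The enum and the function name `path_get_status` are hypothetical and are not the signature of InnoDB's os_file_get_status; on Linux, ENAMETOOLONG is error number 36, which matches the number quoted in the commit title.

```c
#include <errno.h>
#include <sys/stat.h>

/* Hypothetical result type for this sketch. */
typedef enum {
    PATH_EXISTS,
    PATH_MISSING,
    PATH_NAME_TOO_LONG,
    PATH_ERROR
} path_status;

/* Classify a path without asserting on "expected" failures. */
path_status path_get_status(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) == 0) {
        return PATH_EXISTS;
    }

    switch (errno) {
    case ENOENT:
    case ENOTDIR:
        return PATH_MISSING;        /* the file is simply not there */
    case ENAMETOOLONG:
        return PATH_NAME_TOO_LONG;  /* report it instead of crashing */
    default:
        return PATH_ERROR;          /* anything else is a real I/O error */
    }
}
```

A caller can then turn `PATH_NAME_TOO_LONG` into an ordinary error return rather than a failed assertion.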
Jan Lindström
b62c4c6586 Better comments and add a test case. 2014-11-25 08:31:03 +02:00
Jan Lindström
ef1ba3b1e6 MDEV-7164: innodb.innodb-alter-table-disk-full fails in buildbot on Windows
Analysis: Test case uses Linux specific error codes.

Fix: The test case currently cannot run on Windows because it requires
injecting an error into the system.
2014-11-24 15:26:47 +02:00
Sergei Golubchik
ffc0ef6316 5.5 merge 2014-11-21 20:20:39 +01:00
Sergey Vojtovich
4472a0ef95 MDEV-7026 - Race in InnoDB/XtraDB mutex implementation can stall or hang the
server

This is an addition to original patch. Added full memory barrier to ensure
proper StoreLoad order between waiters and lock_word on PPC64.
2014-11-21 15:23:18 +04:00
Jan Lindström
b0febdb66e MDEV-7084: innodb index stats inadequate using constant innodb_stats_sample_pages
Use traditional statistics estimation by default (innodb-stats-traditional=true).
There could be performance regression for customers if there is a lot of
open table operations.
2014-11-21 13:27:36 +02:00
Sergei Golubchik
a9a6bd5256 InnoDB 5.6.21 2014-11-20 16:59:22 +01:00
Sergei Golubchik
3c12c27907 5.5 merge 2014-11-20 16:07:34 +01:00
Sergei Golubchik
afca52bb52 5.5 merge 2014-11-20 15:26:31 +01:00
Jan Lindström
8bc5eabea8 MDEV-7084: innodb index stats inadequate using constant
innodb_stats_sample_pages

Analysis: If the number of analyzed pages is set very low compared
to the actual number of pages in the table/index, the pages are
picked at random (default 8 pages), so a query after ANALYZE TABLE
can return different results. If the index tree is small, smaller
than 10 * n_sample_pages + total_external_size, then the estimate
is ok. For bigger index trees it is common that we do not see any
borders between key values in the few pages we pick, although there
may still be n_sample_pages different key values, or even more. The
estimate then just approximates to n_sample_pages (8).

Fix: (1) Introduced new dynamic configuration variable
innodb_stats_sample_traditional  that retains
the current design. Default false.

(2) If traditional sample is not used we use
n_sample_pages = max(min(srv_stats_sample_pages,
                         index->stat_index_size),
                     log2(index->stat_index_size)*
                          srv_stats_sample_pages);

(3) Introduced new dynamic configuration variable
stat_modified_counter (default = 0); if set, it
sets a lower bound for row updates before statistics are re-estimated.

If user has provided upper bound for how many rows needs to be updated
before we calculate new statistics we use minimum of provided value
and 1/16 of table every 16th round. If no upper bound is provided
(srv_stats_modified_counter = 0, default) then calculate new statistics
if 1 / 16 of table has been modified
since the last time a statistics batch was run.
We calculate statistics at most every 16th round, since we may have
a counter table which is very small and updated very often.
@param t table
@return true if the table has changed too much and stats need to be
recalculated
*/
#define DICT_TABLE_CHANGED_TOO_MUCH(t) \
	((ib_int64_t) (t)->stat_modified_counter > (srv_stats_modified_counter ? \
	ut_min(srv_stats_modified_counter, (16 + (t)->stat_n_rows / 16)) : \
		16 + (t)->stat_n_rows / 16))
2014-11-19 20:27:34 +02:00
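The two formulas in MDEV-7084 can be tried out in isolation. The sketch below is not InnoDB code: the function names are hypothetical, plain parameters stand in for srv_stats_sample_pages, index->stat_index_size and srv_stats_modified_counter, and log2 is taken as an integer floor, which is an assumption because the commit message does not say how it is rounded.

```c
#include <stdint.h>

/* Integer floor(log2(n)) for n >= 1 (rounding is an assumption). */
static uint64_t floor_log2(uint64_t n)
{
    uint64_t bits = 0;
    while (n > 1) {
        n >>= 1;
        bits++;
    }
    return bits;
}

/* max(min(sample_pages, index_size), log2(index_size) * sample_pages) */
uint64_t n_sample_pages(uint64_t sample_pages, uint64_t index_size)
{
    uint64_t capped, scaled;

    if (index_size == 0) {
        return 0;   /* log2(0) is undefined; nothing to sample anyway */
    }
    capped = sample_pages < index_size ? sample_pages : index_size;
    scaled = floor_log2(index_size) * sample_pages;
    return capped > scaled ? capped : scaled;
}

/* Same comparison as DICT_TABLE_CHANGED_TOO_MUCH: with bound == 0 the
   threshold is 16 + n_rows / 16, otherwise the smaller of the two. */
int table_changed_too_much(int64_t modified, int64_t n_rows, int64_t bound)
{
    int64_t threshold = 16 + n_rows / 16;

    if (bound != 0 && bound < threshold) {
        threshold = bound;
    }
    return modified > threshold;
}
```

For an index of 1024 pages and 8 configured sample pages the sketch samples 80 pages, so the sample grows with the logarithm of the index size instead of staying fixed at 8.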
Sergei Golubchik
3495801e2e 5.5 merge 2014-11-19 17:23:39 +01:00
Jan Lindström
b432c7bc42 MDEV-7133: InnoDB: Assertion failure in dict_tf_is_valid
Problem is that page compressed tables currently require atomic_blobs and
that feature is not available currently for row_format=redundant.

Fix: Disallow page compressed create option if table row_format=redundant.
2014-11-19 14:58:48 +02:00
Kristian Nielsen
6ea41f1e84 MDEV-7026: Race in InnoDB/XtraDB mutex implementation can stall or hang the server.
The bug was that full memory barrier was missing in the code that ensures that
a waiter on an InnoDB mutex will not go to sleep unless it is guaranteed to be
woken up again by another thread currently holding the mutex. This made
possible a race where a thread could get stuck waiting for a mutex that is in
fact no longer locked. If that thread was also holding other critical locks,
this could stall the entire server. There is an error monitor thread that can
break the stall, it runs about once per second. But if the error monitor
thread itself got stuck or was not running, then the entire server could hang
infinitely.

This was introduced on i386/amd64 platforms in 5.5.40 and 10.0.13 by an
incorrect patch that tried to fix the similar problem for PowerPC.

This commit reverts the incorrect PowerPC patch, and instead implements a fix
for PowerPC that does not change i386/amd64 behaviour, making PowerPC work
similarly to i386/amd64.
2014-11-19 13:56:46 +01:00
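The StoreLoad ordering that MDEV-7026 talks about can be shown with C11 atomics. This is a heavily simplified sketch, not the InnoDB mutex: the struct and function names are hypothetical, and the real code uses its own barrier primitives. The point is only that each side stores its own flag, issues a full barrier, and then loads the other side's flag, so at least one of the two threads is guaranteed to see the other.

```c
#include <stdatomic.h>

/* Hypothetical two-field mutex state for this sketch. */
typedef struct {
    atomic_int lock_word;  /* 1 while the mutex is held */
    atomic_int waiters;    /* 1 while some thread intends to sleep */
} sketch_mutex;

/* Waiter side: announce the wait, then re-check the lock.
   Returns 1 if the mutex is still held and sleeping is safe. */
int waiter_must_sleep(sketch_mutex *m)
{
    atomic_store_explicit(&m->waiters, 1, memory_order_relaxed);
    /* Full barrier: the store above may not be reordered after the load below. */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&m->lock_word, memory_order_relaxed) != 0;
}

/* Holder side: release the lock, then check for waiters.
   Returns 1 if a waiter must be signalled. */
int release_needs_signal(sketch_mutex *m)
{
    atomic_store_explicit(&m->lock_word, 0, memory_order_relaxed);
    /* Full barrier for the same reason, in the opposite direction. */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&m->waiters, memory_order_relaxed) != 0;
}
```

Without the two fences, a weakly ordered CPU could let the waiter read a stale `lock_word` while the holder reads a stale `waiters`, which is the lost-wakeup race the commit describes.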
Sergei Golubchik
303eec5774 MDEV-6880 Can't define CURRENT_TIMESTAMP as default value for added column
ALTER TABLE: don't fill default values per row, do it once.
And do it in two places - for copy_data_between_tables() and for online ALTER.

Also, run function_defaults test both for MyISAM and for InnoDB.
2014-11-18 22:25:33 +01:00
Jan Lindström
7bf391c205 MDEV-7108: Make long semaphore wait timeout configurable
Merge Facebook commit cd063ab930
authored by Peng Tian from https://github.com/facebook/mysql-5.6

Introduced a new configuration variable innodb_fatal_semaphore_wait_threshold,
it makes the fatal semaphore timeout configurable. Modified original commit
so that no MariaDB server files are changed, instead introduced a new
InnoDB/XtraDB configuration variable.

Its default/min/max values are 600/1/2^32-1 in seconds (it was hardcoded
as 600, now its default value is 600, so the default behavior of this diff
should be no change).
2014-11-17 09:59:52 +02:00
Jan Lindström
ea83226872 MDEV-7088: Query stats for compression based on TRIM size
Analysis: Status variables were missing from innodb_status_variables
array.

Fix: Add missing status variables to the array.
2014-11-12 15:37:52 +02:00
Jan Lindström
0f32299437 MDEV-7035: Remove innodb_io_capacity setting depending on
setting of innodb_io_capacity_max

(a) Changed the behaviour so that if you set innodb_io_capacity to a 
value > innodb_io_capacity_max that the value is accepted AND 
that innodb_io_capacity_max = innodb_io_capacity * 2.

(b) If someone wants to reduce innodb_io_capacity_max and 
reduce it below innodb_io_capacity then innodb_io_capacity 
should be reduced to the same level as innodb_io_capacity_max.

In both cases give a warning to user.
2014-11-13 13:24:26 +02:00
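Rules (a) and (b) of MDEV-7035 can be written down as two setters. The struct, the function names and the `warned` flag are hypothetical stand-ins for the real system-variable update hooks and the warning sent to the user; overflow of the doubling is not handled in this sketch.

```c
/* Hypothetical holder for the two settings. */
typedef struct {
    unsigned long io_capacity;
    unsigned long io_capacity_max;
    int warned;   /* set to 1 when the other variable was adjusted */
} io_caps;

/* (a) A value above the current maximum is accepted and the maximum
   is raised to twice the new value. */
void set_io_capacity(io_caps *c, unsigned long value)
{
    c->warned = 0;
    if (value > c->io_capacity_max) {
        c->io_capacity_max = value * 2;
        c->warned = 1;
    }
    c->io_capacity = value;
}

/* (b) Lowering the maximum below io_capacity drags io_capacity down
   to the same level. */
void set_io_capacity_max(io_caps *c, unsigned long value)
{
    c->warned = 0;
    if (value < c->io_capacity) {
        c->io_capacity = value;
        c->warned = 1;
    }
    c->io_capacity_max = value;
}
```

Starting from 200/2000, setting io_capacity to 3000 yields 3000/6000, and then lowering the maximum to 1000 yields 1000/1000, each with a warning.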
Jan Lindström
bff2d46bf7 MDEV-7100: InnoDB error monitor might unnecessary wait log_sys mutex
Analysis: InnoDB error monitor is responsible to call every second
sync_arr_wake_threads_if_sema_free() to wake up possible hanging
threads if they are missed in mutex_signal_object. This is not
possible if error monitor itself is on mutex/semaphore wait. We
should avoid all unnecessary mutex/semaphore waits on error monitor.
Currently error monitor calls function buf_flush_stat_update()
that calls log_get_lsn() function and there we will try to get
log_sys mutex. A better solution for the error monitor is that in
buf_flush_stat_update() we will try to get lsn with
mutex_enter_nowait() and if we did not get mutex do not update
the stats.

Fix: Use log_get_lsn_nowait() function on buf_flush_stat_update()
function. If returned lsn is 0, we do not update flush stats.
log_get_lsn_nowait() will use mutex_enter_nowait() and if
we get mutex we return a correct lsn if not we return 0.
2014-11-13 12:00:57 +02:00
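The "try the lock, otherwise skip the update" pattern from MDEV-7100 can be sketched as follows. This is not the InnoDB implementation: a C11 `atomic_flag` stands in for the log_sys mutex and mutex_enter_nowait(), and the struct and function names are hypothetical. As in the commit message, 0 is used as the "could not get the mutex" value.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical log state; the flag plays the role of the log_sys mutex. */
typedef struct {
    atomic_flag busy;
    uint64_t lsn;
} log_state;

/* Return the current lsn, or 0 if the lock is held by someone else.
   Never blocks. */
uint64_t get_lsn_nowait(log_state *log)
{
    uint64_t lsn = 0;

    if (!atomic_flag_test_and_set(&log->busy)) {  /* got the lock */
        lsn = log->lsn;
        atomic_flag_clear(&log->busy);            /* release it again */
    }
    return lsn;
}

/* Update the caller's statistics only when an lsn was obtained.
   Returns 1 if the stats were updated, 0 if the update was skipped. */
int flush_stat_update(log_state *log, uint64_t *last_lsn)
{
    uint64_t lsn = get_lsn_nowait(log);

    if (lsn == 0) {
        return 0;   /* lock was busy: skip this round instead of waiting */
    }
    *last_lsn = lsn;
    return 1;
}
```

The monitoring thread therefore loses one statistics sample when the lock is contended, but it can never end up waiting on the very lock it is supposed to help unblock.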
Elena Stepanova
b99328bbf8 Re-enabling tests disabled due to MDEV-5266 and MySQL:65225 (fixed now) 2014-11-17 20:28:18 +04:00
Jan Lindström
8c7ef99bb2 MDEV-7100: InnoDB error monitor might unnecessary wait log_sys mutex
Analysis: InnoDB error monitor is responsible to call every second
sync_arr_wake_threads_if_sema_free() to wake up possible hanging 
threads if they are missed in mutex_signal_object. This is not 
possible if error monitor itself is on mutex/semaphore wait. We 
should avoid all unnecessary mutex/semaphore waits on error monitor.
Currently error monitor calls function buf_flush_stat_update() 
that calls log_get_lsn() function and there we will try to get 
log_sys mutex. A better solution for the error monitor is that in
buf_flush_stat_update() we will try to get lsn with 
mutex_enter_nowait() and if we did not get mutex do not update 
the stats.

Fix: Use log_get_lsn_nowait() function on buf_flush_stat_update()
function. If returned lsn is 0, we do not update flush stats. 
log_get_lsn_nowait() will use mutex_enter_nowait() and if
we get mutex we return a correct lsn if not we return 0.
2014-11-13 11:24:19 +02:00
Jan Lindström
a03dd94be8 MDEV-6936: Buffer pool list scan optimization
Merged Facebook commit 617aef9f911d825e9053f3d611d0389e02031225
authored by Inaam Rana to InnoDB storage engine (not XtraDB)
from https://github.com/facebook/mysql-5.6

WL#7047 - Optimize buffer pool list scans and related batch processing

Reduce excessive scanning of pages when doing flush list batches. The
fix is to introduce the concept of "Hazard Pointer", this reduces the
time complexity of the scan from O(n*n) to O(n).

The concept of hazard pointer is reversed in this work. Academically
hazard pointer is a pointer that the thread working on it will declare as
such and as long as that thread is not done no other thread is allowed to
do anything with it.

In this WL we declare the pointer as a hazard pointer and then if any
thread attempts to work on it, it is allowed to do so but it has to adjust
the hazard pointer to the next valid value. We use hazard pointer solely for
reverse traversal of lists within a buffer pool instance.

Add an event to control the background flush thread. The background flush
thread wait has been converted to an os event timed wait so that it can be
signalled by threads that want to kick start a background flush when the
buffer pool is running low on free/dirty pages.
2014-11-06 13:17:11 +02:00
Jan Lindström
84de277099 Fix error message output if posix_fallocate (trim) is not successful. 2014-11-05 09:18:47 +02:00
Jan Lindström
8b1b62dd8f Fix compiler failure on Windows. 2014-11-04 15:41:39 +02:00
Thirunarayanan B
821dfcd8d2 Bug #19815702 TIS620: CRASH WITH MULTI TABLE DELETE
Description:
  Using correct length when moving to next field in cmp_ref. The store
length already includes the length bytes of blobs, which is already considered
earlier for blob types.
	Approved by Mattias, Jimmy [rb-7088]
2014-11-04 17:40:29 +05:30
Jan Lindström
251fa7ffc5 Fix error on trim operation alignment. Furthermore, make sure that
we do not return simulated out of file space on read operation,
that would cause crash.
2014-11-04 12:26:48 +02:00
Alexander Barkov
43f185e171 MDEV-5528 Command line variable to choose MariaDB-5.3 vs MySQL-5.6 temporal data formats 2014-11-03 21:45:06 +04:00
Jan Lindström
2da6f7ceba MDEV-7017: Add function to print semaphore waits
Add function to print to stderr all current semaphore
waits. This function should be able to be executed
inside gdb/ddd.
2014-11-03 15:43:44 +02:00
Jan Lindström
cb37c55768 MDEV-6929: Port Facebook Prefix Index Queries Optimization
Merge Facebook commit 154c579b828a60722a7d9477fc61868c07453d08
and e8f0052f9b112dc786bf9b957ed5b16a5749f7fd authored
by Steaphan Greene from https://github.com/facebook/mysql-5.6

Optimize prefix index queries to skip cluster index lookup when possible.

Currently InnoDB will always fetch the clustered index (primary key
index) for all prefix columns in an index, even when the value of a
particular record is smaller than the prefix length. This change
optimizes that case to use the record from the secondary index and avoid
the extra lookup.

Also adds two status vars that track how effective this is:

innodb_secondary_index_triggered_cluster_reads:
Times secondary index lookup triggered cluster lookup.

innodb_secondary_index_triggered_cluster_reads_avoided:
Times prefix optimization avoided triggering cluster lookup.
2014-11-03 11:18:52 +02:00
Marko Makela
a265914018 Bug#19904003 INNODB_LIMIT_OPTIMISTIC_INSERT_DEBUG=1 CAUSES INFINITE PAGE SPLIT
The debug configuration parameter innodb_optimistic_insert_debug
which was introduced for testing corner cases in B-tree handling
had a bug in it. The value 1 would trigger an infinite sequence
of page splits.

Fix: When the value 1 is specified, disable this debug feature.
Approved by Yasufumi Kinoshita
2014-10-30 08:53:46 +02:00
Marko Makela
ff906f032f Bug#19904003 INNODB_LIMIT_OPTIMISTIC_INSERT_DEBUG=1 CAUSES INFINITE PAGE SPLIT
The debug configuration parameter innodb_optimistic_insert_debug
which was introduced for testing corner cases in B-tree handling
had a bug in it. The value 1 would trigger an infinite sequence
of page splits.

Fix: When the value 1 is specified, disable this debug feature.
Approved by Yasufumi Kinoshita
2014-10-30 08:53:46 +02:00
Jan Lindström
2bf3e416fe MDEV-6932: Enable Lazy Flushing
Merge Facebook commit 4f3e0343fd2ac3fc7311d0ec9739a8f668274f0d
authored by Steaphan Greene from https://github.com/facebook/mysql-5.6

Adds innodb_idle_flush_pct to enable tuning of the page flushing rate
when the system is relatively idle. We care about this, since doing
extra unnecessary flash writes shortens the lifespan of the flash.
2014-10-29 13:49:12 +02:00
Annamalai Gurusami
ffc33cffe3 Bug #19908343 SERVER CRASHES WHEN EXECUTING ALTER TABLE
Problem:

In the function dict_foreign_remove_from_cache(), the rb tree was updated
without actually verifying whether the given foreign key object is there in the
rb tree or not.  There can be an existing foreign key object with the same id 
in the rb tree, which must not be removed.  Such a scenario comes when an
attempt is made to add a foreign key object with a duplicate identifier.

Solution:

When the foreign key object is removed from the dictionary cache, ensure
that the foreign key object removed from the rbt is the correct one.

rb#7168 approved by Jimmy and Marko.
2014-10-29 16:53:53 +05:30
Annamalai Gurusami
4274242655 Bug #19908343 SERVER CRASHES WHEN EXECUTING ALTER TABLE
Problem:

In the function dict_foreign_remove_from_cache(), the rb tree was updated
without actually verifying whether the given foreign key object is there in the
rb tree or not.  There can be an existing foreign key object with the same id 
in the rb tree, which must not be removed.  Such a scenario comes when an
attempt is made to add a foreign key object with a duplicate identifier.

Solution:

When the foreign key object is removed from the dictionary cache, ensure
that the foreign key object removed from the rbt is the correct one.

rb#7168 approved by Jimmy and Marko.
2014-10-29 16:53:53 +05:30
Jan Lindström
58888e2c08 MDEV-6935: Change the default value for innodb_log_compressed_pages to false
Merge Facebook commit ca40b4417fd224a68de6636b58c92f133703fc68
authored by Steaphan Greene from https://github.com/facebook/mysql-5.6
Change the default value for innodb_log_compressed_pages to false

Logging these pages is a waste. We don't want this to be enabled.

One caution here: If the zlib version used by innodb is changed, but
the running version is still the previous version, and the running
version crashes, it is possible crash recovery could fail.

When crash recovery uses a zlib version at all different than the
version used by the crashed instance, it is possible that a redone
compression could fail, where the original did not, because the new
zlib version compresses the same data to a slightly larger size.

Because of the nature of compression, this is even possible when
upgrading to a version of zlib which actually performs overall better
compression than the previous version.

If this happens, mysql will fail to recover, since a page split can
not be safely triggered during crash recovery.

So, either the exact zlib version must be controlled between builds,
or these rare recovery failures must be accepted. The cost of
logging these pages is quite high, so we consider this limitation to
be worthwhile.

This failure scenario can not happen if there was a clean shutdown.
This is only relevant to restarting crashed instances, or starting an
instance built via a hot backup tool (XtraBackup).
2014-10-29 11:07:38 +02:00
Jan Lindström
2d2d11f02b MDEV-6968: CREATE TABLE crashes with InnoDB plugin
Analysis: fil_extend_space_to_desired_size() does not provide file
node to os_aio(). This failed on Windows only because on Windows
we do not use posix_fallocate() to extend file space.

Fix: Add file node to os_aio() function call and make sure that
we do not use NULL pointer at os_aio_array_reserve_slot(). Additionally,
make sure that we do not use 0 as file_block_size (512 is the minimum).
2014-10-29 11:07:37 +02:00
Jan Lindström
b96697d286 MDEV-6648: InnoDB: Add support for 4K sector size if supported
New generation hard drives, SSDs and NVM devices support 4K
sector size. Supported sector size can be found using fstatvfs()
or GetDiskFreeSpace() functions.
2014-10-29 11:07:11 +02:00
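On POSIX systems, the fstatvfs() lookup mentioned in MDEV-6648 can be sketched as below. The function name `file_block_size` is hypothetical, and using `f_bsize` is an assumption: it is the file system block size, which is not necessarily the physical sector size of the device, and the commit message does not say which field InnoDB reads. The 512-byte floor follows the "512 is the minimum" note in MDEV-6968.

```c
#include <fcntl.h>
#include <sys/statvfs.h>
#include <unistd.h>

/* Return the block size reported for the file system holding `path`,
   falling back to 512 when the query fails or reports less than 512. */
unsigned long file_block_size(const char *path)
{
    struct statvfs st;
    unsigned long size = 512;   /* conservative default */
    int fd = open(path, O_RDONLY);

    if (fd < 0) {
        return size;            /* cannot open: keep the default */
    }
    if (fstatvfs(fd, &st) == 0 && st.f_bsize >= 512) {
        size = st.f_bsize;
    }
    close(fd);
    return size;
}
```

On many Linux file systems this reports 4096, but the value depends on how the file system was created.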