Commit graph

1700 commits

Author SHA1 Message Date
Jan Lindström
18b307a7d2 MDEV-6590: InnoDB recover ends in signal 11
Analysis: When database is migrated from 5.5 or earlier and
database needs crash recovery, there is possibility that
SYS_DATAFILES system table does not exists, but
crash recovery in function dict_check_tablespaces_and_store_max_id()
assumes that SYS_DATAFILES exists.

Fix: If SYS_DATAFILES does not exists, create it before
we end up to function dict_check_tablespaces_and_store_max_id()
on crash recovery.
2014-08-25 13:35:33 +03:00
Michael Widenius
bd2117d154 Automatic merge from 5.5
Fixed 2 failing tests by replacing result files
2014-08-19 21:35:14 +03:00
Michael Widenius
5569132ffe MDEV-6450 - MariaDB crash on Power8 when built with advance tool chain
Part of this work is based on Stewart Smitch's memory barrier and lower priori
patches for power8.

- Added memory syncronization for innodb & xtradb for power8.
- Added HAVE_WINDOWS_MM_FENCE to CMakeList.txt
- Added os_isync to fix a syncronization problem on power
- Added log_get_lsn_nowait which is now used srv_error_monitor_thread to ensur
  if log mutex is locked.

All changes done both for InnoDB and Xtradb
2014-08-19 19:28:35 +03:00
Sergei Golubchik
6fb17a0601 5.5.39 merge 2014-08-07 18:06:56 +02:00
Sergei Golubchik
2023fac281 innodb-5.6.19 2014-08-06 20:05:10 +02:00
Sergei Golubchik
be253829d7 compilation failure on labrador 2014-08-06 10:11:08 +02:00
Sergey Vojtovich
82ce2a2503 MDEV-6450 - MariaDB crash on Power8 when built with advance
tool chain

Reverted addition to the original patch:
InterlockedExchangeAcquire and InterlockedAndRelease are
supported only on Itanium-based systems.
2014-08-04 11:55:38 +04:00
Sergei Golubchik
1c6ad62a26 mysql-5.5.39 merge
~40% bugfixed(*) applied
~40$ bugfixed reverted (incorrect or we're not buggy)
~20% bugfixed applied, despite us being not buggy
(*) only changes in the server code, e.g. not cmakefiles
2014-08-02 21:26:16 +02:00
Sergey Vojtovich
53360fd45f MDEV-6450 - MariaDB crash on Power8 when built with advance
tool chain

This is an addition to the original patch. On Windows
InterlockedExchange implies full memory barrier, whereas
only acquire/release barriers required.
2014-07-31 14:31:05 +04:00
Jan Lindström
c39a501cbe Try to fix compiler error seen on Labrador. 2014-07-31 14:43:35 +03:00
Sergei Golubchik
3e9d4541d5 MDEV-6340 Mariadb 10.0.12 fatal "Lost connection" error w/ GCC 4.9 'Release' build; workaround ~ CFLAGS="-fno-delete-null-pointer-checks"
don't use attribute nonnull for arguments that can be null
2014-07-31 09:51:05 +02:00
Michael Widenius
a1c1700b89 Fixed some compiler warnings 2014-07-30 10:05:01 +03:00
Sergei Golubchik
6ef139780d MDEV-6497 InnoDB deadlocks on UNINSTALL PLUGIN
Free the trx of the current thd (if any) in innobase_end()
2014-07-29 09:09:52 +02:00
Jan Lindström
476feba941 Fix unnecessary printout of same writer thread more than once. Fixed
also a compiler warning.
2014-07-26 20:17:59 +03:00
Jan Lindström
a3acd72570 Merge InnoDB fixes from 5.5 revisions 4229, 4230, 4233, 4237 and 4238 i.e.
4229: MDEV-5670: Assertion failure in file buf0lru.c line 2355
      Add more status information if repeatable.

4230: MDEV-5673: Crash while parallel dropping multiple tables under heavy load
      Improve long semaphore wait output to include all semaphore waits
      and try to find out if there is a sequence of waiters.

4233: Fix compiler errors on product build.

4237: Fix too agressive long semaphore wait output and add guard against introducing
      compression failures on insert buffer.

4238: Fix test failure caused by simulated compression failure on
      IBUF_DUMMY table.
2014-07-25 10:30:16 +03:00
Sergey Vojtovich
6192f0bffa MDEV-6483 - Deadlock around rw_lock_debug_mutex on PPC64
This problem affects only debug builds on PPC64.

There are at least two race conditions around
rw_lock_debug_mutex_enter and rw_lock_debug_mutex_exit:

- rw_lock_debug_waiters was loaded/stored without setting
  appropriate locks/memory barriers.
- there is a gap between calls to os_event_reset() and
  os_event_wait() and in such case we're supposed to pass
  return value of the former to the latter.

Fixed by replacing self-cooked spinlocks with system mutexes.
These days system mutexes offer much better performance. OTOH
performance is not that critical for debug builds.
2014-07-24 18:12:32 +04:00
Jan Lindström
c104965746 Fix test failure caused by simulated compression failure on
IBUF_DUMMY table.
2014-07-25 09:34:05 +03:00
Jan Lindström
7bf45bec54 Fix too agressive long semaphore wait output and add guard against introducing
compression failures on insert buffer.
2014-07-24 14:35:09 +03:00
Jan Lindström
be00265557 Fix compiler errors on product build. 2014-07-23 13:52:17 +03:00
Jan Lindström
a1e41e3258 MDEV-6470: Restrict number of error messages about persistent statictic tables not found
If mysql.innodb_table_stats or mysql.innodb_index_stats is not found or has
unexpected structure output that error only once and no other error for
every table trying to use them. If they do exists, then print fetch or
recalculation errors only once / table or index.
2014-07-22 19:31:45 +03:00
Jan Lindström
ef67c3a239 MDEV-6443: Server crashed with assertaion failure in file ha_innodb.cc
line 8473

In case InnoDB index is not found, print the MySQL and InnoDB index
name we were trying to find and all MySQL and InnoDB index names there
is for this table.
2014-07-22 13:17:16 +03:00
Jan Lindström
fe3859cc9d MDEV-6443: Server crashed with assertaion failure in file
ha_innodb.cc line 8473

If index is not found from InnoDB make sure we print what we
were trying to find and all mysql and InnoDB index names there
is for this table.
2014-07-22 13:08:32 +03:00
Jan Lindström
0bb0230e3a MDEV-6426: Maria DB crashes randomly on creating indexes
Improve OS error messages on Windows.
2014-07-22 10:10:56 +03:00
Sergey Vojtovich
c0ebb3f388 MDEV-6450 - MariaDB crash on Power8 when built with advance tool
chain

InnoDB mutex_exit() function calls __sync_test_and_set() to release
the lock. According to manual this function is supposed to create
"acquire" memory barrier whereas in fact we need "release" memory
barrier at mutex_exit().

The problem isn't repeatable with gcc because it creates
"acquire-release" memory barrier for __sync_test_and_set().
ATC creates just "acquire" barrier.

Fixed by creating proper barrier at mutex_exit() by using
__sync_lock_release() instead of __sync_test_and_set().
2014-07-18 15:16:25 +04:00
Kristian Nielsen
501c56ef1e MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel replication causing replication to fail.
Merge the patches into MariaDB 10.0 main.

With this patch, parallel replication will now automatically retry a
transaction that fails due to deadlock or other temporary error, same as
single-threaded replication.

We catch deadlocks with InnoDB transactions due to enforced commit order. If
T1 must commit before T2 in parallel replication and T1 ends up waiting for T2
inside InnoDB, we kill T2 and retry it later to resolve the deadlock
automatically.
2014-07-11 12:06:47 +02:00
Kristian Nielsen
5b75891b7b Fix compile failure in non-debug build. 2014-07-10 14:24:53 +02:00
Kristian Nielsen
45f6262f54 MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel replication causing replication to fail.
After-review changes. Fix InnoDB coding style issues.
2014-07-09 13:02:52 +02:00
Kristian Nielsen
92577cc0eb MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel replication causing replication to fail.
Fix small (but nasty) typo.
2014-07-08 14:54:53 +02:00
Kristian Nielsen
98fc5b3af8 MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel replication causing replication to fail.
After-review changes.

For this patch in 10.0, we do not introduce a new public storage engine API,
we just fix the InnoDB/XtraDB issues. In 10.1, we will make a better public
API that can be used for all storage engines (MDEV-6429).

Eliminate the background thread that did deadlock kills asynchroneously.
Instead, we ensure that the InnoDB/XtraDB code can handle doing the kill from
inside the deadlock detection code (when thd_report_wait_for() needs to kill a
later thread to resolve a deadlock).

(We preserve the part of the original patch that introduces dedicated mutex
and condition for the slave init thread, to remove the abuse of
LOCK_thread_count for start/stop synchronisation of the slave init thread).
2014-07-08 12:54:47 +02:00
Jan Lindström
e005734058 MDEV-6424: Mariadb server crashes with assertion failure in file ha_innodb.cc
Analysis: For some reason table stats for a table pointed from a index 
is not initialized. Added additional warning output on this situation
and table stats initialization. This is better than asserting.
2014-07-08 21:05:18 +03:00
Jan Lindström
648fb98eb6 MDEV-6348: mariadb crash signal 11
Analysis: sync array output function, should make sure that all
used pointers are valid before using them.

Merge revision 4225 from lp:maria/5.5.
2014-07-08 18:51:34 +03:00
Jan Lindström
43c851435f MDEV-6288: Innodb causes server crash after disk full,
then can't ALTER TABLE any more.

Fix for InnoDB storage engine.
2014-07-04 06:31:48 +03:00
Gopal Shankar
119984db0c Bug#18776592 INNODB: FAILING ASSERTION: PRIMARY_KEY_NO == -1 ||
PRIMARY_KEY_NO == 0 

This bug is a backport of the following revision of 5.6 source tree:
# committer: Gopal Shankar <gopal.shankar@oracle.com>
# branch nick: priKey56
# timestamp: Wed 2013-05-29 11:11:46 +0530
# message:
#   Bug#16368875 INNODB: FAILING ASSERTION:
2014-06-25 09:50:17 +05:30
Jan Lindström
ebf3437810 MDEV-5673: Crash while parallel dropping multiple tables under heavy load
Improve long semaphore wait output to include all semaphore waits
and try to find out if there is a sequence of waiters.
2014-07-23 09:04:59 +03:00
Jan Lindström
67eb6f33a9 MDEV-5670: Assertion failure in file buf0lru.c line 2355
Add more status information if repeatable.
2014-07-22 22:08:06 +03:00
unknown
71a596b90c Makes innodb/xtradb compilable in 5.5 2014-07-15 12:37:34 +03:00
Jan Lindström
970163d0be MDEV-6348: mariadb crash signal 11
Analysis: sync array output function, should make sure that all 
used pointers are valid before using them.
2014-07-08 17:21:13 +03:00
Jan Lindström
8d8c456dbb MDEV-5621: Server random crash on ALTER TABLE
This is not a real fix, instead try to gather additional information
at the point when dictionary content is not what we expect it to be.
2014-07-04 12:25:32 +03:00
Jan Lindström
c922048368 MDEV-6191: row_search_for_mysql comment and code consistency about isolation level
and gap locks
2014-07-04 08:42:59 +03:00
unknown
bd4153a8c2 MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel
replication causing replication to fail.

Remove the temporary fix for MDEV-5914, which used READ COMMITTED for parallel
replication worker threads. Replace it with a better, more selective solution.

The issue is with certain edge cases of InnoDB gap locks, for example between
INSERT and ranged DELETE. It is possible for the gap lock set by the DELETE to
block the INSERT, if the DELETE runs first, while the record lock set by
INSERT does not block the DELETE, if the INSERT runs first. This can cause a
conflict between the two in parallel replication on the slave even though they
ran without conflicts on the master.

With this patch, InnoDB will ask the server layer about the two involved
transactions before blocking on a gap lock. If the server layer tells InnoDB
that the transactions are already fixed wrt. commit order, as they are in
parallel replication, InnoDB will ignore the gap lock and allow the two
transactions to proceed in parallel, avoiding the conflict.

Improve the fix for MDEV-6020. When InnoDB itself detects a deadlock, it now
asks the server layer for any preferences about which transaction to roll
back. In case of parallel replication with two transactions T1 and T2 fixed to
commit T1 before T2, the server layer will ask InnoDB to roll back T2 as the
deadlock victim, not T1. This helps in some cases to avoid excessive deadlock
rollback, as T2 will in any case need to wait for T1 to complete before it can
itself commit.

Also some misc. fixes found during development and testing:

 - Remove thd_rpl_is_parallel(), it is not used or needed.

 - Use KILL_CONNECTION instead of KILL_QUERY when a parallel replication
   worker thread is killed to resolve a deadlock with fixed commit
   ordering. There are some cases, eg. in sql/sql_parse.cc, where a KILL_QUERY
   can be ignored if the query otherwise completed successfully, and this
   could cause the deadlock kill to be lost, so that the deadlock was not
   correctly resolved.

 - Fix random test failure due to missing wait_for_binlog_checkpoint.inc.

 - Make sure that deadlock or other temporary errors during parallel
   replication are not printed to the the error log; there were some places
   around the replication code with extra error logging. These conditions can
   occur occasionally and are handled automatically without breaking
   replication, so they should not pollute the error log.

 - Fix handling of rgi->gtid_sub_id. We need to be able to access this also at
   the end of a transaction, to be able to detect and resolve deadlocks due to
   commit ordering. But this value was also used as a flag to mark whether
   record_gtid() had been called, by being set to zero, losing the value. Now,
   introduce a separate flag rgi->gtid_pending, so rgi->gtid_sub_id remains
   valid for the entire duration of the transaction.

 - Fix one place where the code to handle ignored errors called reset_killed()
   unconditionally, even if no error was caught that should be ignored. This
   could cause loss of a deadlock kill signal, breaking deadlock detection and
   resolution.

 - Fix a couple of missing mysql_reset_thd_for_next_command(). This could
   cause a prior error condition to remain for the next event executed,
   causing assertions about errors already being set and possibly giving
   incorrect error handling for following event executions.

 - Fix code that cleared thd->rgi_slave in the parallel replication worker
   threads after each event execution; this caused the deadlock detection and
   handling code to not be able to correctly process the associated
   transactions as belonging to replication worker threads.

 - Remove useless error code in slave_background_kill_request().

 - Fix bug where wfc->wakeup_error was not cleared at
   wait_for_commit::unregister_wait_for_prior_commit(). This could cause the
   error condition to wrongly propagate to a later wait_for_prior_commit(),
   causing spurious ER_PRIOR_COMMIT_FAILED errors.

 - Do not put the binlog background thread into the processlist. It causes
   too many result differences in mtr, but also it probably is not useful
   for users to pollute the process list with a system thread that does not
   really perform any user-visible tasks...
2014-06-10 10:13:15 +02:00
Annamalai Gurusami
b5299f3559 Bug #18806829 OPENING INNODB TABLES WITH MANY FOREIGN KEY REFERENCES IS
SLOW/CRASHES SEMAPHORE

Problem:

There are 2 lakh tables - fk_000001, fk_000002 ... fk_200000.  All of them
are related to the same parent_table through a foreign key constraint.
When the parent_table is loaded into the dictionary cache, all the child table
will also be loaded.  This is taking lot of time.  Since this operation happens
when the dictionary latch is taken, the scenario leads to "long semaphore wait"
situation and the server gets killed.

Analysis:

A simple performance analysis showed that the slowness is because of the
dict_foreign_find() function.  It does a linear search on two linked list
table->foreign_list and table->referenced_list, looking for a particular
foreign key object based on foreign->id as the key.  This is called two
times for each foreign key object.

Solution:

Introduce a rb tree in table->foreign_rbt and table->referenced_rbt, which
are some sort of index on table->foreign_list and table->referenced_list
respectively, using foreign->id as the key.  These rbt structures will be
solely used by dict_foreign_find().  

rb#5599 approved by Vasil
2014-06-10 09:35:50 +05:30
Sergey Vojtovich
23a5b2eb6d MDEV-6103 - Adding/removing non-materialized virtual column triggers
table recreation

Relaxed InnoDB/XtraDB checks to allow online add/drop of
non-materialized virtual columns.
2014-06-03 16:57:29 +04:00
unknown
629b822913 MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel
replication causing replication to fail.

In parallel replication, we run transactions from the master in parallel, but
force them to commit in the same order they did on the master. If we force T1
to commit before T2, but T2 holds eg. a row lock that is needed by T1, we get
a deadlock when T2 waits until T1 has committed.

Usually, we do not run T1 and T2 in parallel if there is a chance that they
can have conflicting locks like this, but there are certain edge cases where
it can occasionally happen (eg. MDEV-5914, MDEV-5941, MDEV-6020). The bug was
that this would cause replication to hang, eventually getting a lock timeout
and causing the slave to stop with error.

With this patch, InnoDB will report back to the upper layer whenever a
transactions T1 is about to do a lock wait on T2. If T1 and T2 are parallel
replication transactions, and T2 needs to commit later than T1, we can thus
detect the deadlock; we then kill T2, setting a flag that causes it to catch
the kill and convert it to a deadlock error; this error will then cause T2 to
roll back and release its locks (so that T1 can commit), and later T2 will be
re-tried and eventually also committed.

The kill happens asynchroneously in a slave background thread; this is
necessary, as the reporting from InnoDB about lock waits happen deep inside
the locking code, at a point where it is not possible to directly call
THD::awake() due to mutexes held.

Deadlock is assumed to be (very) rarely occuring, so this patch tries to
minimise the performance impact on the normal case where no deadlocks occur,
rather than optimise the handling of the occasional deadlock.

Also fix transaction retry due to deadlock when it happens after a transaction
already signalled to later transactions that it started to commit. In this
case we need to undo this signalling (and later redo it when we commit again
during retry), so following transactions will not start too early.

Also add a missing thd->send_kill_message() that got triggered during testing
(this corrects an incorrect fix for MySQL Bug#58933).
2014-06-03 10:31:11 +02:00
Sergei Golubchik
5d16592d44 mysql-5.5.38 merge 2014-06-03 09:55:08 +02:00
Sergei Golubchik
99027efd14 post-fix for the merge of "Bug#16216513 INPLACE ALTER DISABLED FOR PARTITIONED TABLES"
make this innodb-only patch work for other engines as well
2014-05-08 10:25:09 +02:00
Sergei Golubchik
9927b36e87 merge of "Bug#16216513 INPLACE ALTER DISABLED FOR PARTITIONED TABLES"
revno: 4777
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: mysql-5.6
timestamp: Fri 2013-02-15 10:32:25 +0200
message:
  Bug#16216513 INPLACE ALTER DISABLED FOR PARTITIONED TABLES
2014-05-08 10:01:31 +02:00
Sergei Golubchik
8ee9d19607 innodb 5.6.17 2014-05-07 17:32:23 +02:00
Jan Lindström
1adf9e7984 MDEV-6257: MariaDB 5.5 fails to start with 10.0 InnoDB log files
Analysis: Can't disable the error message because you may get database
started with incorrect log file size. 

Fix: Thus only improve the error message to give more information 
to users.
2014-05-22 16:20:56 +03:00
Jan Lindström
75137522b9 MDEV-6257: MariaDB 5.5 fails to start with 10.0 InnoDB log files
Analysis: By default 10.0 creates 48M log files and 5.5 assumes they
are 5M.

Fix: Remove the error and do size comparison later.
2014-05-21 13:14:43 +03:00
Sergei Golubchik
e2e5d07b28 MDEV-6184 10.0.11 merge
InnoDB 5.6.16
2014-05-06 09:57:39 +02:00