162 commits

Author SHA1 Message Date
Marko Mäkelä
c051eaba46 MDEV-14988 innodb_read_only tries to modify files if transactions were recovered in COMMITTED state
lock_trx_release_locks(): Relax a debug assertion to allow
recovered TRX_STATE_COMMITTED_IN_MEMORY transactions.

trx_commit_in_memory(): Add DEBUG_SYNC instrumentation.

trx_undo_insert_cleanup(): Skip persistent changes if innodb_read_only
is set. This should only happen when a recovered committed transaction
would be cleaned up at shutdown.
2018-02-13 14:29:32 +02:00
Marko Mäkelä
431607237d MDEV-12173 "Error: trying to do an operation on a dropped tablespace"
InnoDB is issuing a 'noise' message that is not a sign of abnormal
operation. The only issuers of it are the debug function
lock_rec_block_validate() and the change buffer merge.
While the error should ideally never occur in transactional locking,
we happen to know that DISCARD TABLESPACE and TRUNCATE TABLE and
possibly DROP TABLE are breaking InnoDB table locks.

When it comes to the change buffer merge, the message simply is useless
noise. We know perfectly well that a tablespace can be dropped while a
change buffer merge is pending. And the code is prepared to handle that,
which is demonstrated by the fact that whenever the message was issued,
InnoDB did not crash.

fil_inc_pending_ops(): Remove the parameter print_err.
2018-01-22 16:58:13 +02:00
Marko Mäkelä
b933a8c354 MDEV-12569 InnoDB suggests filing bugs at MySQL bug tracker
Replace all references in InnoDB and XtraDB error log messages
to bugs.mysql.com with references to https://jira.mariadb.org/.

The original merge
commit 4274d0bf57
was accidentally reverted by the subsequent merge
commit 3b35d745c3
2017-10-26 13:29:28 +03:00
Vicențiu Ciorbaru
3b35d745c3 Merge branch 'merge-innodb-5.6' into 10.0 2017-10-26 12:46:47 +03:00
Marko Mäkelä
4274d0bf57 Merge 5.5 into 10.0 2017-10-26 11:13:07 +03:00
Marko Mäkelä
cfb3361748 MDEV-12569 InnoDB suggests filing bugs at MySQL bug tracker
Replace all references in InnoDB and XtraDB error log messages
to bugs.mysql.com with references to https://jira.mariadb.org/.
2017-10-26 11:02:19 +03:00
Marko Mäkelä
6b45355e6b MDEV-13103 Assertion `flags & BUF_PAGE_PRINT_NO_CRASH' failed in buf_page_print
buf_page_print(): Remove the parameter 'flags',
and when a server abort is intended, perform that in the caller.

In this way, page corruption reports due to different reasons
can be distinguished better.

This is non-functional code refactoring that does not fix any
page corruption issues. The change is only made to avoid falsely
grouping together unrelated causes of page corruption.
2017-09-06 14:01:15 +03:00
Marko Mäkelä
4754f88cff Never pass NULL to innobase_get_stmt() 2017-05-17 08:11:01 +03:00
Marko Mäkelä
ff16609374 MDEV-12674 Innodb_row_lock_current_waits has overflow
There is a race condition related to the variable
srv_stats.n_lock_wait_current_count, which is only
incremented and decremented by the function lock_wait_suspend_thread().

The incrementing is protected by lock_sys->wait_mutex, but the
decrementing does not appear to be protected by anything.
This mismatch could allow the counter to be corrupted when a
transactional InnoDB table or record lock wait is terminating
at roughly the same time as the start of a wait on a
(possibly different) lock.

ib_counter_t: Remove some unused methods. Prevent instantiation for N=1.
Add an inc() method that takes a slot index as a parameter.

single_indexer_t: Remove.

simple_counter<typename Type, bool atomic=false>: A new counter wrapper.
Optionally use atomic memory operations for modifying the counter.
Aligned to the cache line size.

lsn_ctr_1_t, ulint_ctr_1_t, int64_ctr_1_t: Define as simple_counter<Type>.
These counters are either only incremented (and we do not care about
losing some increment operations), or the increment/decrement operations
are protected by some mutex.

srv_stats_t::os_log_pending_writes: Document that the number is protected
by log_sys->mutex.

srv_stats_t::n_lock_wait_current_count: Use simple_counter<ulint, true>,
that is, atomic inc() and dec() operations.

lock_wait_suspend_thread(): Release the mutexes before incrementing
the counters. Avoid acquiring the lock mutex if the lock wait has
already been resolved. Atomically increment and decrement
srv_stats.n_lock_wait_current_count.

row_insert_for_mysql(), row_update_for_mysql(),
row_update_cascade_for_mysql(): Use the inc() method with the trx->id
as the slot index. This is a non-functional change, just using
inc() instead of add(1).

buf_LRU_get_free_block(): Replace the method add(index, n) with inc().
There is no slot index in the simple_counter.
2017-05-12 12:24:53 +03:00
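
The counter wrapper described in ff16609374 can be pictured with the following minimal sketch. It is not the InnoDB implementation: the class name, the hard-coded 64-byte alignment, and the choice of relaxed memory ordering are all assumptions made for illustration. The non-atomic mode is modelled as a separate load and store, so concurrent increments may be lost, which matches the "we do not care about losing some increment operations" case in the message.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Illustrative stand-in for simple_counter<Type, atomic>; names and the
// 64-byte cache line size are assumptions, not taken from the InnoDB source.
template <typename Type, bool Atomic = false>
class alignas(64) simple_counter_sketch {
 public:
  // Add n and return the new value.
  Type add(Type n) {
    if (Atomic) {
      // Single atomic read-modify-write: no update can be lost.
      return m_counter.fetch_add(n, std::memory_order_relaxed) + n;
    }
    // Separate load and store: cheaper, but a concurrent update may be lost.
    Type v = m_counter.load(std::memory_order_relaxed) + n;
    m_counter.store(v, std::memory_order_relaxed);
    return v;
  }

  // Subtract n and return the new value.
  Type sub(Type n) {
    if (Atomic) {
      return m_counter.fetch_sub(n, std::memory_order_relaxed) - n;
    }
    Type v = m_counter.load(std::memory_order_relaxed) - n;
    m_counter.store(v, std::memory_order_relaxed);
    return v;
  }

  Type inc() { return add(1); }
  Type dec() { return sub(1); }

  Type value() const { return m_counter.load(std::memory_order_relaxed); }

 private:
  std::atomic<Type> m_counter{0};
};
```

Under these assumptions, a counter such as n_lock_wait_current_count would be declared as simple_counter_sketch<unsigned long, true>, so that an inc() at the start of one wait and a dec() at the end of another cannot corrupt the value.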
Sergei Golubchik
a79d46c3a4 Merge branch 'merge-innodb-5.6' into 10.0 2016-06-21 14:58:19 +02:00
Sergei Golubchik
720e04ff67 5.6.31 2016-06-21 14:21:03 +02:00
Sergei Golubchik
c8fcaf8aec Merge branch 'merge-innodb-5.6' into 10.0 2016-02-16 18:32:59 +01:00
Sergei Golubchik
220e70fadc 5.6.29 2016-02-16 12:07:18 +01:00
Sergei Golubchik
04af573d65 Merge branch 'merge-innodb-5.6' into 10.0 2015-10-09 17:47:30 +02:00
Sergei Golubchik
86ff4da14d 5.6.27 2015-10-09 17:21:46 +02:00
Aditya A
608efca4c4 Bug #21025880 DUPLICATE UK VALUES IN READ-COMMITTED (AGAIN)
PROBLEM

Whenever we insert into a unique secondary index, we take shared
locks on all possible duplicate records present in the table.
But during a replace on the unique secondary index,
we take exclusive locks on all the duplicate records.
When the records are deleted, they are first delete-marked
and later purged by the purge thread. While purging the
record we call lock_update_delete(), which in turn calls
lock_rec_inherit_to_gap() to inherit the locks of the deleted
records. In repeatable read mode we inherit all the locks
from the record to the next record, but in read committed
mode we skip inheriting them as gap type locks. We make an
exception here: if the lock on the record is in shared mode,
we assume that it was set during an insert into a unique secondary
index and needs to be inherited to prevent a constraint violation.
We did not handle the case when exclusive locks are set during
replace; we skipped inheriting the locks of these records, hence
causing a constraint violation.

FIX

While inheriting the locks, check whether the transaction is
allowed to do TRX_DUP_REPLACE/TRX_DUP_IGNORE; if so,
inherit the locks.

[ Reviewed by Jimmy #rb9709]
2015-08-12 19:17:26 +05:30
Sergei Golubchik
70a3fec400 InnoDB-5.6.24 2015-05-05 00:06:23 +02:00
Sergei Golubchik
085297a121 5.6.24 2015-05-04 22:13:46 +02:00
Sergei Golubchik
6d06fbbd1d move to storage/innobase 2015-05-04 19:17:21 +02:00
Kristian Nielsen
184f718fef MDEV-7249: Performance problem in parallel replication with multi-level slaves
Parallel replication (in 10.0 / "conservative" mode) relies on binlog group
commits to group transactions that can be safely run in parallel on the
slave. The --binlog-commit-wait-count and --binlog-commit-wait-usec options
exist to increase the number of commits per group. But in case of conflicts
between transactions, this can cause unnecessary delay and reduced throughput,
especially on a slave where commit order is fixed.

This patch adds a heuristic to reduce this problem. When transaction T1 goes
to commit, it will first wait for N transactions to queue up for a group
commit. However, if we detect that another transaction T2 is waiting for a row
lock held by T1, then we will skip the wait and let T1 commit immediately,
releasing its locks and letting T2 continue.

On a slave, this avoids the unfortunate situation where T1 is waiting for T2
to join the group commit, but T2 is waiting for T1 to release locks, causing
no work to be done for the duration of the --binlog-commit-wait-usec timeout.

(The heuristic seems reasonable on the master as well, so it is enabled for
all transactions, not just replication transactions).
2015-03-13 14:01:52 +01:00
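
The heuristic from 184f718fef can be summarised as a single decision function. This is only a sketch under stated assumptions: the struct, its field names, and the function name are hypothetical and do not come from the server source; only the two option names are taken from the commit message.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical snapshot of the state T1 sees while waiting to group-commit.
struct CommitWaitState {
  std::size_t queued;      // transactions already queued for this group commit
  std::size_t wait_count;  // corresponds to --binlog-commit-wait-count
  long elapsed_usec;       // time T1 has waited so far
  long wait_usec;          // corresponds to --binlog-commit-wait-usec
  bool other_trx_waits_on_our_lock;  // some T2 is blocked on a row lock held by T1
};

// Returns true if T1 should keep waiting for more transactions to join.
bool should_keep_waiting(const CommitWaitState& s) {
  if (s.other_trx_waits_on_our_lock) {
    // The new heuristic: waiting cannot help, because T2 cannot reach its
    // commit until T1 commits and releases its locks.
    return false;
  }
  if (s.queued >= s.wait_count) {
    return false;  // the group is already large enough
  }
  if (s.elapsed_usec >= s.wait_usec) {
    return false;  // the timeout has expired
  }
  return true;
}
```

Without the first check, T1 and T2 would each wait for the other until the --binlog-commit-wait-usec timeout expired, which is the stall the commit message describes.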
Jan Lindström
8799f87075 MDEV-7623: Add lock wait time and hold time to every record/table lock in
InnoDB transaction lock printout.
2015-02-24 10:33:49 +02:00
Jan Lindström
90635c6fb5 MDEV-7620: Transaction lock wait is missing number of lock
waits and total wait time.
2015-02-23 11:24:19 +02:00
Sergei Golubchik
6b05688f6d innodb 5.6.23 2015-02-18 17:59:21 +01:00
Jan Lindström
56da6252f7 Improve InnoDB transaction lock output by providing number of table locks
this transaction is currently holding and total number of table locks to
the table where lock is held.
2015-02-10 15:15:27 +02:00
Sergei Golubchik
476a8660e6 InnoDB 5.6.22 2015-01-21 14:33:39 +01:00
Sergei Golubchik
a9a6bd5256 InnoDB 5.6.21 2014-11-20 16:59:22 +01:00
Jan Lindström
7e71dfa9f5 MDEV-6933: Spurious lock_wait_timeout_thread wakeup in lock_wait_suspend_thread()
Merged Facebook's commit 6e06bbfa315ffb97d713dd6e672d6054036ddc21
authored by Inaam Rana from https://github.com/facebook/mysql-5.6.

Fixes MySQL bug http://bugs.mysql.com/bug.php?id=72123

lock_timeout thread works in a tight loop waking up every second
and checking for lock_wait_timeout. In addition, when a mysql
thread is forced to wait on a lock, it signals the lock_timeout thread
as well. This call is not required. In a heavily contended workload
each thread going to wait will signal the lock_timeout thread making
it work all the time. As lock_timeout thread scans the array of
waiting threads under lock_sys::wait_mutex which is already very
hot in contended loads, these extra scans can cause significant
performance regression.

Also, in various codepaths lock_timeout thread is signalled where
actual intention was to signal the innodb monitor thread.
2014-10-24 17:56:04 +03:00
Kristian Nielsen
5b75891b7b Fix compile failure in non-debug build. 2014-07-10 14:24:53 +02:00
Kristian Nielsen
45f6262f54 MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel replication causing replication to fail.
After-review changes. Fix InnoDB coding style issues.
2014-07-09 13:02:52 +02:00
Kristian Nielsen
92577cc0eb MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel replication causing replication to fail.
Fix small (but nasty) typo.
2014-07-08 14:54:53 +02:00
Kristian Nielsen
98fc5b3af8 MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel replication causing replication to fail.
After-review changes.

For this patch in 10.0, we do not introduce a new public storage engine API,
we just fix the InnoDB/XtraDB issues. In 10.1, we will make a better public
API that can be used for all storage engines (MDEV-6429).

Eliminate the background thread that did deadlock kills asynchronously.
Instead, we ensure that the InnoDB/XtraDB code can handle doing the kill from
inside the deadlock detection code (when thd_report_wait_for() needs to kill a
later thread to resolve a deadlock).

(We preserve the part of the original patch that introduces dedicated mutex
and condition for the slave init thread, to remove the abuse of
LOCK_thread_count for start/stop synchronisation of the slave init thread).
2014-07-08 12:54:47 +02:00
unknown
bd4153a8c2 MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel
replication causing replication to fail.

Remove the temporary fix for MDEV-5914, which used READ COMMITTED for parallel
replication worker threads. Replace it with a better, more selective solution.

The issue is with certain edge cases of InnoDB gap locks, for example between
INSERT and ranged DELETE. It is possible for the gap lock set by the DELETE to
block the INSERT, if the DELETE runs first, while the record lock set by
INSERT does not block the DELETE, if the INSERT runs first. This can cause a
conflict between the two in parallel replication on the slave even though they
ran without conflicts on the master.

With this patch, InnoDB will ask the server layer about the two involved
transactions before blocking on a gap lock. If the server layer tells InnoDB
that the transactions are already fixed wrt. commit order, as they are in
parallel replication, InnoDB will ignore the gap lock and allow the two
transactions to proceed in parallel, avoiding the conflict.

Improve the fix for MDEV-6020. When InnoDB itself detects a deadlock, it now
asks the server layer for any preferences about which transaction to roll
back. In case of parallel replication with two transactions T1 and T2 fixed to
commit T1 before T2, the server layer will ask InnoDB to roll back T2 as the
deadlock victim, not T1. This helps in some cases to avoid excessive deadlock
rollback, as T2 will in any case need to wait for T1 to complete before it can
itself commit.

Also some misc. fixes found during development and testing:

 - Remove thd_rpl_is_parallel(), it is not used or needed.

 - Use KILL_CONNECTION instead of KILL_QUERY when a parallel replication
   worker thread is killed to resolve a deadlock with fixed commit
   ordering. There are some cases, eg. in sql/sql_parse.cc, where a KILL_QUERY
   can be ignored if the query otherwise completed successfully, and this
   could cause the deadlock kill to be lost, so that the deadlock was not
   correctly resolved.

 - Fix random test failure due to missing wait_for_binlog_checkpoint.inc.

 - Make sure that deadlock or other temporary errors during parallel
   replication are not printed to the error log; there were some places
   around the replication code with extra error logging. These conditions can
   occur occasionally and are handled automatically without breaking
   replication, so they should not pollute the error log.

 - Fix handling of rgi->gtid_sub_id. We need to be able to access this also at
   the end of a transaction, to be able to detect and resolve deadlocks due to
   commit ordering. But this value was also used as a flag to mark whether
   record_gtid() had been called, by being set to zero, losing the value. Now,
   introduce a separate flag rgi->gtid_pending, so rgi->gtid_sub_id remains
   valid for the entire duration of the transaction.

 - Fix one place where the code to handle ignored errors called reset_killed()
   unconditionally, even if no error was caught that should be ignored. This
   could cause loss of a deadlock kill signal, breaking deadlock detection and
   resolution.

 - Fix a couple of missing mysql_reset_thd_for_next_command(). This could
   cause a prior error condition to remain for the next event executed,
   causing assertions about errors already being set and possibly giving
   incorrect error handling for following event executions.

 - Fix code that cleared thd->rgi_slave in the parallel replication worker
   threads after each event execution; this caused the deadlock detection and
   handling code to not be able to correctly process the associated
   transactions as belonging to replication worker threads.

 - Remove useless error code in slave_background_kill_request().

 - Fix bug where wfc->wakeup_error was not cleared at
   wait_for_commit::unregister_wait_for_prior_commit(). This could cause the
   error condition to wrongly propagate to a later wait_for_prior_commit(),
   causing spurious ER_PRIOR_COMMIT_FAILED errors.

 - Do not put the binlog background thread into the processlist. It causes
   too many result differences in mtr, but also it probably is not useful
   for users to pollute the process list with a system thread that does not
   really perform any user-visible tasks...
2014-06-10 10:13:15 +02:00
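
The two questions that InnoDB puts to the server layer in bd4153a8c2 (may a gap lock be ignored, and which transaction should be the deadlock victim) can be sketched as follows. Everything here is an assumption for illustration: the struct, the commit_seq field, and both function names are hypothetical, and the real server-layer interface is not shown in the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical view of a transaction as seen by the server layer.
struct TrxInfoSketch {
  bool parallel_replication;  // executed by a parallel replication worker
  std::uint64_t commit_seq;   // position in the fixed commit order (lower commits first)
};

// In this sketch the relative commit order is fixed only when both
// transactions belong to parallel replication.
bool commit_order_is_fixed(const TrxInfoSketch& a, const TrxInfoSketch& b) {
  return a.parallel_replication && b.parallel_replication;
}

// Gap lock check: if the commit order is already fixed, the waiter does not
// need to block on the holder's gap lock.
bool may_ignore_gap_lock(const TrxInfoSketch& waiter, const TrxInfoSketch& holder) {
  return commit_order_is_fixed(waiter, holder);
}

// Deadlock victim preference: roll back the transaction that must commit
// later. Returns nullptr when the server layer has no preference.
const TrxInfoSketch* preferred_victim(const TrxInfoSketch& a, const TrxInfoSketch& b) {
  if (!commit_order_is_fixed(a, b)) {
    return nullptr;
  }
  return a.commit_seq > b.commit_seq ? &a : &b;
}
```

Choosing the later transaction as the victim follows the reasoning in the message: T2 has to wait for T1 to complete before it can commit anyway, so rolling back T2 loses less work.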
unknown
629b822913 MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel
replication causing replication to fail.

In parallel replication, we run transactions from the master in parallel, but
force them to commit in the same order they did on the master. If we force T1
to commit before T2, but T2 holds eg. a row lock that is needed by T1, we get
a deadlock when T2 waits until T1 has committed.

Usually, we do not run T1 and T2 in parallel if there is a chance that they
can have conflicting locks like this, but there are certain edge cases where
it can occasionally happen (eg. MDEV-5914, MDEV-5941, MDEV-6020). The bug was
that this would cause replication to hang, eventually getting a lock timeout
and causing the slave to stop with error.

With this patch, InnoDB will report back to the upper layer whenever a
transaction T1 is about to do a lock wait on T2. If T1 and T2 are parallel
replication transactions, and T2 needs to commit later than T1, we can thus
detect the deadlock; we then kill T2, setting a flag that causes it to catch
the kill and convert it to a deadlock error; this error will then cause T2 to
roll back and release its locks (so that T1 can commit), and later T2 will be
re-tried and eventually also committed.

The kill happens asynchronously in a slave background thread; this is
necessary, as the reporting from InnoDB about lock waits happens deep inside
the locking code, at a point where it is not possible to directly call
THD::awake() due to mutexes held.

Deadlock is assumed to be (very) rarely occurring, so this patch tries to
minimise the performance impact on the normal case where no deadlocks occur,
rather than optimise the handling of the occasional deadlock.

Also fix transaction retry due to deadlock when it happens after a transaction
already signalled to later transactions that it started to commit. In this
case we need to undo this signalling (and later redo it when we commit again
during retry), so following transactions will not start too early.

Also add a missing thd->send_kill_message() that got triggered during testing
(this corrects an incorrect fix for MySQL Bug#58933).
2014-06-03 10:31:11 +02:00
Sergei Golubchik
8ee9d19607 innodb 5.6.17 2014-05-07 17:32:23 +02:00
Sergei Golubchik
27d45e4696 MDEV-5574 Set AUTO_INCREMENT below max value of column.
Update InnoDB to 5.6.14
Apply MySQL-5.6 hack for MySQL Bug#16434374
Move Aria-only HA_RTREE_INDEX from my_base.h to maria_def.h (breaks an assert in InnoDB)
Fix InnoDB memory leak
2014-02-01 09:33:26 +01:00
Murthy Narkedimilli
cf2d852653 Fixing the bug 16919882 - WRONG FSF ADDRESS IN LICENSES HEADERS 2013-06-10 22:29:41 +02:00
Michael Widenius
068c61978e Temporary commit of 10.0-merge 2013-03-26 00:03:13 +02:00
Aditya A
4137279353 Bug#16268289 LOCK_REC_VALIDATE_PAGE() MAY DEREFERENCE A POINTER TO A
FREED LOCK

ANALYSIS
--------

In 5.5 code the lock_rec_block_validate() is called after releasing
the kernel mutex. There is a chance that the lock might be invalid so,
we are getting the valgrind error on invalid read on lock->index.

FIX
---

Fix would be to copy the lock->index when we are holding the kernel mutex 
and then pass it to the lock_rec_block_validate(). This implementation
is present in 5.1 code.  

[ Approved by sunny rb.no.oracle.com/rb/r/2152/ ]
2013-03-13 11:43:21 +05:30
Sergei Golubchik
ab83952f29 10.0-base merge 2013-01-31 09:48:19 +01:00
Annamalai Gurusami
a69f4a0573 Bug #16004999 ASSERT STATE == TRX_STATE_NOT_STARTED, UNLOCK_ROW()
Problem:

During the index intersect access method, the SQL layer will access one row,
that satisfies a set of conditions, using an index i1.  And then it will try to
access the same row, with another set of conditions, using the next index i2.  If
the fetch from i2 fails (we are talking about an error situation here and not
simply an unmatched row situation), then it will unlock the row accessed via
i1.  This will work in all situations except deadlock error.

When a deadlock happens, InnoDB will roll back the transaction.  InnoDB notifies
the SQL layer about this through the THD::transaction_rollback_request member.
But this is not currently used by the SQL layer.

Solution:

When an error happens, the SQL layer must check the 
THD::transaction_rollback_request member, before calling handler::unlock_row().
We have also added a debug assert in ha_innobase::unlock_row() checking that
it must be called only when the transaction is in active state.

rb#1773 approved by Marko and Sunny.
2013-01-10 10:28:04 +05:30
Yasufumi Kinoshita
eb6a89b4d1 Bug#59354 : Bug #12659252 : ASSERT !OTHER_LOCK AT LOCK_REC_ADD_TO_QUEUE DURING A DELETE OPERATION
The converted implicit lock should wait for the prior conflicting lock if found.

rb://1437 approved by Marko
2012-11-28 17:07:02 +09:00
Yasufumi Kinoshita
47619514f5 Bug#59354 : Bug #12659252 : ASSERT !OTHER_LOCK AT LOCK_REC_ADD_TO_QUEUE DURING A DELETE OPERATION
The converted implicit lock should wait for the prior conflicting lock if found.

rb://1437 approved by Marko
2012-11-28 17:05:23 +09:00
Vasil Dimov
95264568e7 This is a backport of "WL#5674 InnoDB: report all deadlocks (Bug#1784)"
from MySQL 5.6 into MySQL 5.5

Will close Bug#14515889 BACKPORT OF INNODB DEADLOCK LOGGING TO 5.5

The original implementation is in
vasil.dimov@oracle.com-20101213120811-k2ldtnao2t6zrxfn

Approved by:	Jimmy (rb:1535)
2012-11-12 14:24:43 +02:00
Michael Widenius
1d0f70c2f8 Temporary commit of merge of MariaDB 10.0-base and MySQL 5.6 2012-08-01 17:27:34 +03:00
Inaam Rana
3c1bdb356d merge from 5.1 2012-03-15 13:34:50 -04:00
Inaam Rana
d86c431bdd merge from 5.1 2012-03-15 13:34:50 -04:00
Inaam Rana
e56854d8e5 Bug#13789853 SHOW ENGINE INNODB STATUS HANGS DUE TO EXCESSIVE WORK
IN LOCK_VALIDATE()

rb://917
approved by: Marko Makela

In lock_validate() the limit is used to release the kernel_mutex during
the validation, to obey the latching order.
If we do the limit++ then we are rechecking the same lock most times on
each iteration because limit is being incremented by one and
<space, page_no> will nearly always be > limit. If we set the limit
correctly to (space, page+1) then we are actually making progress
during the iteration.
2012-03-12 13:04:54 -04:00
Inaam Rana
273c626269 Bug#13789853 SHOW ENGINE INNODB STATUS HANGS DUE TO EXCESSIVE WORK
IN LOCK_VALIDATE()

rb://917
approved by: Marko Makela

In lock_validate() the limit is used to release the kernel_mutex during
the validation, to obey the latching order.
If we do the limit++ then we are rechecking the same lock most times on
each iteration because limit is being incremented by one and
<space, page_no> will nearly always be > limit. If we set the limit
correctly to (space, page+1) then we are actually making progress
during the iteration.
2012-03-12 13:04:54 -04:00
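
The difference between limit++ and jumping to (space, page+1) in Bug#13789853 can be seen in a simplified model. This is not the lock_validate() code: it assumes that the limit is a 64-bit key built from (space, page_no) and that each pass looks up the first locked page whose key is not below the limit; the function names and the std::set container are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Pack (space, page_no) into one ordered 64-bit key (assumed encoding).
std::uint64_t make_key(std::uint32_t space, std::uint32_t page_no) {
  return (static_cast<std::uint64_t>(space) << 32) | page_no;
}

// Counts how many scan passes are needed to visit every locked page.
// Each pass stands for: find the next page at or after the limit, release
// the mutex, validate that page, then re-acquire the mutex and continue.
// Caveat: a key of UINT64_MAX would wrap the limit; the sketch ignores that.
std::size_t validate_passes(const std::set<std::uint64_t>& locked_pages,
                            bool jump_past_found) {
  std::uint64_t limit = 0;
  std::size_t passes = 0;
  for (;;) {
    std::set<std::uint64_t>::const_iterator it = locked_pages.lower_bound(limit);
    if (it == locked_pages.end()) {
      break;  // nothing left at or beyond the limit
    }
    ++passes;
    // Old behaviour: limit + 1, which usually finds the same page again.
    // Fixed behaviour: continue right after the page just validated.
    limit = jump_past_found ? *it + 1 : limit + 1;
  }
  return passes;
}
```

In this model, two locked pages (0, 10) and (0, 1000) take 2 passes with the fix and 1001 passes with limit++, since every limit value from 0 to 1000 finds one of the two pages again.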
Marko Mäkelä
39100cd984 Bug #13651627 Move ut_ad(0) from the beginning to the end of buf_page_print(),
print page dump

buf_page_print(): Remove the ut_ad(0) from the beginning. Add two flags
(enum buf_page_print_flags) that can be bitwise-ORed together:

BUF_PAGE_PRINT_NO_CRASH:
  Do not crash debug builds at the end of buf_page_print().
BUF_PAGE_PRINT_NO_FULL:
  Do not print the full page dump. This can be useful when adding
  diagnostic printout to flushing or to the doublewrite buffer.

trx_sys_doublewrite_init_or_restore_page(): Replace exit(1) with ut_error,
so that we can get a core dump if this extraordinary condition happens.

rb:924 approved by Sunny Bains
2012-02-02 12:31:57 +02:00
Marko Mäkelä
d87a29bea9 Bug #13651627 Move ut_ad(0) from the beginning to the end of buf_page_print(),
print page dump

buf_page_print(): Remove the ut_ad(0) from the beginning. Add two flags
(enum buf_page_print_flags) that can be bitwise-ORed together:

BUF_PAGE_PRINT_NO_CRASH:
  Do not crash debug builds at the end of buf_page_print().
BUF_PAGE_PRINT_NO_FULL:
  Do not print the full page dump. This can be useful when adding
  diagnostic printout to flushing or to the doublewrite buffer.

trx_sys_doublewrite_init_or_restore_page(): Replace exit(1) with ut_error,
so that we can get a core dump if this extraordinary condition happens.

rb:924 approved by Sunny Bains
2012-02-02 12:31:57 +02:00
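
The two flags introduced for buf_page_print() in Bug #13651627 are described as values that can be bitwise-ORed together. A minimal sketch of that pattern follows; the flag names with the SKETCH_ prefix, their numeric values, and the PrintPlan helper are assumptions for illustration, not the InnoDB definitions. Note that the later commit 6b45355e6b in this log removes the 'flags' parameter again.

```cpp
#include <cassert>

// Illustrative counterparts of BUF_PAGE_PRINT_NO_CRASH and
// BUF_PAGE_PRINT_NO_FULL; the bit values are assumed.
enum buf_page_print_flags_sketch {
  SKETCH_NO_CRASH = 1,  // do not crash debug builds after printing
  SKETCH_NO_FULL = 2    // do not print the full page dump
};

// Hypothetical summary of what a call would do for a given flag set.
struct PrintPlan {
  bool dump_full_page;
  bool crash_debug_build;
};

PrintPlan plan_page_print(unsigned flags) {
  PrintPlan plan;
  plan.dump_full_page = (flags & SKETCH_NO_FULL) == 0;
  plan.crash_debug_build = (flags & SKETCH_NO_CRASH) == 0;
  return plan;
}
```

A caller adding diagnostics to flushing or to the doublewrite buffer would, in this sketch, pass SKETCH_NO_CRASH | SKETCH_NO_FULL to get a short report without aborting a debug build.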