Commit graph

15 commits

Author SHA1 Message Date
Vlad Lesin
b54e7b0cea MDEV-31185 rw_trx_hash_t::find() unpins pins too early
rw_trx_hash_t::find() acquires element->mutex, then unpins pins, used for
lf_hash element search. After that the "element" can be deallocated and
reused by some other thread.

If we take a look rw_trx_hash_t::insert()->lf_hash_insert()->lf_alloc_new()
calls, we will not find any element->mutex acquisition, as it was not
initialized yet before it's allocation. rw_trx_hash_t::insert() can reuse
the chunk, unpinned in rw_trx_hash_t::find().

The scenario is the following:

1. Thread 1 have just executed lf_hash_search() in
rw_trx_hash_t::find(), but have not acquired element->mutex yet.
2. Thread 2 have removed the element from hash table with
rw_trx_hash_t::erase() call.
3. Thread 1 acquired element->mutex and unpinned pin 2 pin with
lf_hash_search_unpin(pins) call.
4. Some thread purged memory of the element.
5. Thread 3 reused the memory for the element, filled element->id,
element->trx.
6. Thread 1 crashes with failed "DBUG_ASSERT(trx_id == trx->id)"
assertion.

Note that trx_t objects are also reused, see the code around trx_pools
for details.

The fix is to invoke "lf_hash_search_unpin(pins);" after element->trx is
stored in local variable in rw_trx_hash_t::find().

Reviewed by: Nikita Malyavin, Marko Mäkelä.
2023-05-19 15:50:20 +03:00
Marko Mäkelä
24ec8eaf66 MDEV-15532 after-merge fixes from Monty
The Galera tests were massively failing with debug assertions.
2020-12-02 16:16:29 +02:00
Marko Mäkelä
589cf8dbf3 Merge 10.3 into 10.4 2020-12-01 19:51:14 +02:00
Monty
a3531775b1 MDEV-15532 Assertion `!log->same_pk' failed in row_log_table_apply_delete
The real fix for MDEV-15532 will be pushed into 10.2 and 10.6
This is an additional fix for 10.4.

In 10.4 trans_xa_detach was introduced.  However THD::cleanup() assumes
that after trans_xa_detach() is done, there is no registered transactions
anymore. In the 10.2 patch there will be an assert to ensure this, which
will cause 10.4 to fail.

The fix used is to reset the transaction flags in trans_xa_detach().
2020-11-30 19:53:58 +02:00
Monty
16ea692ed4 MDEV-23586 Mariabackup: GTID saved for replication in 10.4.14 is wrong
MDEV-21953 deadlock between BACKUP STAGE BLOCK_COMMIT and parallel
replication

Fixed by partly reverting MDEV-21953 to put back MDL_BACKUP_COMMIT locking
before log_and_order.

The original problem for MDEV-21953 was that while a thread was waiting in
for another threads to commit in 'log_and_order', it had the
MDL_BACKUP_COMMIT lock. The backup thread was waiting to get the
MDL_BACKUP_WAIT_COMMIT lock, which blocks all new MDL_BACKUP_COMMIT locks.
This causes a deadlock as the waited-for thread can never get past the
MDL_BACKUP_COMMIT lock in ha_commit_trans.

The main part of the bug fix is to release the MDL_BACKUP_COMMIT lock while
a thread is waiting for other 'previous' threads to commit. This ensures
that no transactional thread keeps MDL_BACKUP_COMMIT while waiting, which
ensures that there are no deadlocks anymore.
2020-09-25 13:07:03 +03:00
Monty
fc48c8ff4c MDEV-21953 deadlock between BACKUP STAGE BLOCK_COMMIT and parallel repl.
The issue was:
T1, a parallel slave worker thread, is waiting for another worker thread to
commit. While waiting, it has the MDL_BACKUP_COMMIT lock.
T2, working for mariabackup, is doing BACKUP STAGE BLOCK_COMMIT and blocks
all commits.
This causes a deadlock as the thread T1 is waiting for can't commit.

Fixed by moving locking of MDL_BACKUP_COMMIT from ha_commit_trans() to
commit_one_phase_2()

Other things:
- Added a new argument to ha_comit_one_phase() to signal if the
  transaction was a write transaction.
- Ensured that ha_maria::implicit_commit() is always called under
  MDL_BACKUP_COMMIT. This code is not needed in 10.5
- Ensure that MDL_Request values 'type' and 'ticket' are always
  initialized. This makes it easier to check the state of the MDL_Request.
- Moved thd->store_globals() earlier in handle_rpl_parallel_thread() as
  thd->init_for_queries() could use a MDL that could crash if store_globals
  where not called.
- Don't call ha_enable_transactions() in THD::init_for_queries() as this
  is both slow (uses MDL locks) and not needed.
2020-07-21 12:42:42 +03:00
Sergey Vojtovich
5679a2b6b3 Shrink my_atomic.h and my_cpu.h scope 2020-04-15 22:23:03 +04:00
Marko Mäkelä
dba743b653 Fix -Wunused-variable 2019-10-12 06:57:02 +03:00
Sergey Vojtovich
1599825ffc trans_xa_detach() framework
Part of MDEV-7974 - backport fix for mysql bug#12161 (XA and binlog)
2019-04-25 15:06:40 +04:00
Sergey Vojtovich
210855ce5d Move XID_STATE::xid to XID_cache_element
Part of MDEV-7974 - backport fix for mysql bug#12161 (XA and binlog)
2019-04-25 15:06:40 +04:00
Sergey Vojtovich
b7fd7ce286 Moved normal transaction xid to implicit_xid
Part of MDEV-7974 - backport fix for mysql bug#12161 (XA and binlog)
2019-04-25 15:06:40 +04:00
Sergey Vojtovich
228514e52f Move XID_STATE::xa_state to XID_cache_element
Simplified away XA_NOTR, use XID_STATE::is_explicit_XA() instead.

Part of MDEV-7974 - backport fix for mysql bug#12161 (XA and binlog)
2019-04-25 15:06:40 +04:00
Sergey Vojtovich
a168cfb396 Move XID_state::xa_state handing inside xa.cc
Let xid_cache_insert()/xid_cache_delete() handle xa_state.

Let session tracker use is_explicit_XA() rather than xa_state != XA_NOTR.

Fixed open_tables() to refuse data access in XA_ROLLBACK_ONLY state.

Removed dead code from THD::cleanup(). It was supposed to be a reminder,
but it got messed up over time.

spider_internal_start_trx() is called either with XA_NOTR or XA_ACTIVE,
which is guarded by server callers. Thus is_explicit_XA() is acceptable
replacement for XA_ACTIVE check (which was likely wrong anyway).

Setting xa_state to XA_PREPARED in spider_internal_xa_prepare() isn't
meaningful, as this value is never accessed later. It can't be accessed
by current thread and it can't be recovered either. It can only be
accessed by spider internally, which never happens.

Make spider_xa_lock()/spider_xa_unlock() static.

Part of MDEV-7974 - backport fix for mysql bug#12161 (XA and binlog)
2019-04-25 15:06:40 +04:00
Sergey Vojtovich
f189f34ed4 Move XID_STATE::rm_error to XID_cache_element
XID_STATE::rm_error is never used by internal 2PC, it is intended to be
used by explicit XA only.

Also removed redundant xid reset from THD::init_for_queries(). Must've
been done already either by THD::transaction constructor or by
THD::cleanup().

Part of MDEV-7974 - backport fix for mysql bug#12161 (XA and binlog)
2019-04-25 15:06:40 +04:00
Sergey Vojtovich
07140f171d Just move, no code changes otherwise.
Part of MDEV-7974 - backport fix for mysql bug#12161 (XA and binlog)
2019-04-25 15:06:40 +04:00