Contrary to what commit a54abf0175 claimed,
the caller of THD::awake() may actually hold the InnoDB lock_sys->mutex.
That commit introduced a deadlock of threads in the replication slave
when running the test rpl.rpl_parallel_optimistic_nobinlog.
lock_trx_handle_wait(): Expect the callers to acquire and release
lock_sys->mutex and trx->mutex.
innobase_kill_query(): Restore the logic for conditionally acquiring
and releasing the mutexes. THD::awake() can be called from inside
InnoDB while holding one or both mutexes, via thd_report_wait_for() and
via wsrep_innobase_kill_one_trx().
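As an illustration of that conditional acquire/release pattern, here is a minimal, self-contained C++ sketch; OwnedMutex and kill_query_sketch are hypothetical stand-ins, not InnoDB's actual API:
```
#include <atomic>
#include <mutex>
#include <thread>

// Take the mutex only when the current thread does not already own it,
// and release only what was acquired here.
struct OwnedMutex
{
  std::mutex m;
  std::atomic<std::thread::id> owner{};

  bool owned_by_me() const
  { return owner.load() == std::this_thread::get_id(); }
  void lock() { m.lock(); owner.store(std::this_thread::get_id()); }
  void unlock() { owner.store(std::thread::id()); m.unlock(); }
};

void kill_query_sketch(OwnedMutex &lock_sys_mutex)
{
  const bool was_held= lock_sys_mutex.owned_by_me();
  if (!was_held)
    lock_sys_mutex.lock();    // the caller may already hold it,
                              // e.g. when reached via thd_report_wait_for()
  // ... cancel the victim transaction's lock wait ...
  if (!was_held)
    lock_sys_mutex.unlock();  // release only what this function acquired
}
```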
The non-recursive CTE defined with UNION
The problem showed up as the columns of the non-recursive CTE not being
renamed: the renaming procedure was called for recursive CTEs only.
To fix this, st_select_lex_unit::prepare() now calls
With_element::rename_columns_of_derived_unit() for both recursive and
non-recursive CTEs.
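A stubbed sketch of the control-flow change; the types below are illustrative reductions of the real classes:
```
// The renaming step now runs for every CTE, not only for recursive ones.
struct With_element
{
  bool is_recursive= false;
  void rename_columns_of_derived_unit() {} // stub for the real method
};

void prepare_unit(With_element *with_element)
{
  // Before the fix this call was guarded by with_element->is_recursive,
  // so non-recursive CTEs kept their original column names.
  if (with_element)
    with_element->rename_columns_of_derived_unit();
}
```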
ha_innobase::unlock_row(): Use a relaxed version of the
trx_state_eq() debug assertion, because rr_unlock_row()
may be invoked after an error has already been reported
and the transaction has been rolled back.
By definition, c_lock->trx->lock.wait_lock==c_lock cannot hold.
That is, the owner transaction of a lock cannot be waiting for that
particular lock. It must have been waiting for some other lock.
Remove the dead code related to that. Also, test c_lock for NULLness
only once.
Two changes were made to the test:
1) Suppress warning "Refusing exit for the last slave thread."
This warning was already suppressed, but on the wrong node.
2) The test occasionally fails because it expects that the
number of applier threads changes immediately after
changing the value of the variable wsrep_slave_threads,
which is not true. This patch turns snippets like this:
```
SET GLOBAL wsrep_slave_threads = x;
SELECT COUNT(*) = x FROM INFORMATION_SCHEMA.PROCESSLIST WHERE USER = 'system user';
```
into proper wait conditions:
```
SET GLOBAL wsrep_slave_threads = x;
let $wait_condition = SELECT COUNT(*) = x FROM ...;
--source include/wait_condition.inc
```
As this is the only moderately critical file that is fopen()ed for writing,
create an alternate path that uses open() and fdopen() for non-glibc platforms
that support O_CLOEXEC (the BSDs).
Tested on Linux (by modifying the glibc definition) to take this
alternate path:
```
$ cd /proc/23874
$ more fdinfo/71
pos:	0
flags:	02100001
mnt_id:	24
$ ls -la fd/71
l-wx------. 1 dan dan 64 Mar 14 13:30 fd/71 -> /dev/shm/var_auto_i7rl/mysqld.1/data/ib_buffer_pool.incomplete
```
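A sketch of the alternate path, assuming plain POSIX open()/fdopen(); the helper name fopen_cloexec_w is illustrative:
```
#include <cstdio>
#include <fcntl.h>
#include <unistd.h>

// Open a file for writing with close-on-exec set atomically, then wrap
// the descriptor in a FILE* for stdio-style output.
static FILE *fopen_cloexec_w(const char *path)
{
  int fd= open(path, O_WRONLY | O_CREAT | O_TRUNC | O_CLOEXEC, 0644);
  if (fd < 0)
    return nullptr;
  FILE *f= fdopen(fd, "w");
  if (!f)
    close(fd); // do not leak the descriptor when fdopen() fails
  return f;
}
```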
"mtr func_date_add" failed on 32-bit platforms. Removing a wrong case to "long".
Both values[] and log_10_int[] are arrays of "ulonglong", no cast is needed.
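For illustration, this is what the removed cast would do on a platform where long is 32 bits wide:
```
#include <cstdio>

int main()
{
  unsigned long long v= 10000000000ULL; // 10^10 does not fit in 32 bits
  long truncated= (long) v;             // mangled on 32-bit platforms
  printf("%llu -> %ld\n", v, truncated);
  return 0;
}
```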
Refactor get_datetime_value() to not create Item_cache_temporal() at
execution time, but to always do it in ::fix_fields() or ::fix_length_and_dec().
Creating items at the execution time doesn't work very well with
virtual columns and check constraints that are fixed and executed
in different THDs.
Do not assume that it's always item->field_type() - this is not the case
in temporal comparisons (e.g. when comparing DATETIME column with a TIME
literal).
It's a generic function, not using anything from Arg_comparator.
Make it a static function, not a class method, to be able to use
it later without Arg_comparator.
Reorder items in the args[] array. Instead of
when1,then1,when2,then2,...[,case][,else]
sort them as
[case,]when1,when2,...,then1,then2,...[,else]
In this case, all items used for comparison occupy a contiguous part
of the array and can be aggregated directly, and all items that
can be returned occupy a contiguous part of the array and can be
aggregated directly. The old code had to copy them to a temporary
array before aggregation, and then copy back (via thd->change_item_tree)
everything that was changed.
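A self-contained sketch of the reordering; Item is stubbed here, while the real code rearranges the Item* array in place:
```
#include <string>
#include <vector>

// Input layout:  when1, then1, when2, then2, ... [, case][, else]
// Output layout: [case,] when1, when2, ..., then1, then2, ... [, else]
using Item= std::string; // stand-in for the real Item*

std::vector<Item> reorder_case_args(const std::vector<Item> &args,
                                    size_t nwhens, bool has_case,
                                    bool has_else)
{
  std::vector<Item> out;
  const size_t pairs= 2 * nwhens;
  if (has_case)
    out.push_back(args[pairs]);    // [case,]
  for (size_t i= 0; i < pairs; i+= 2)
    out.push_back(args[i]);        // when1, when2, ...
  for (size_t i= 1; i < pairs; i+= 2)
    out.push_back(args[i]);        // then1, then2, ...
  if (has_else)
    out.push_back(args.back());    // [, else]
  return out;
}
```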
musl ships the header for other purposes, but makecontext is not
implemented. Fix the check to detect whether makecontext is implemented
before enabling code that uses it.
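A minimal link-test program in the spirit of the fix: it compiles against musl's <ucontext.h> but fails to link there, which is exactly what the check needs to detect (the build-system wiring is omitted):
```
#include <ucontext.h>

static void trampoline() {}

int main()
{
  ucontext_t uc;
  if (getcontext(&uc))
    return 1;
  // On musl the symbol is declared but not implemented, so linking fails.
  makecontext(&uc, trampoline, 0);
  return 0;
}
```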
Test galera.MW-366 is not deterministic and depends on timing assumptions.
The test occasionally fails after checking the number of 'system user'
processes in the processlist after changing the value of the global
variable wsrep_slave_threads, like this:
```
SET GLOBAL wsrep_slave_threads = x;
--sleep 0.5
SELECT COUNT(*) = x FROM INFORMATION_SCHEMA.PROCESSLIST WHERE USER = 'system user';
```
The problem is that the number of slave threads is internally adjusted
'asynchronously', and it may take some time to spawn/kill new threads,
especially in a heavily loaded system.
This patch removes the '--sleep 0.5' statements from the test and replaces
those with appropriate wait conditions, like this:
```
SET GLOBAL wsrep_slave_threads = x;
let $wait_condition = SELECT COUNT(*) = x FROM ...;
--source include/wait_condition.inc
```
out of order at retry
The test failures were of two sorts. One is that the number of retries
of what the slave deemed a temporary error exceeded
the default value of the slave retry option.
The second issue was an out-of-order commit by transactions that
were supposed to error out instead.
Both issues have the same cause: the post-temporary-error
retry did not check a possibly already existing error status.
This is mended by refining the conditions to retry. Specifically, a retrying
worker checks rpl_parallel_entry::stop_on_error_sub_id, which
a potentially failing predecessor could have set to its own sub id.
Now, should the member be set, the retrying follower errors out with
ER_PRIOR_COMMIT_FAILED.
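A stubbed sketch of the refined condition; the sentinel and types are simplified, and only the member name comes from the commit text:
```
#include <cstdint>

// A failing predecessor stores its sub id in stop_on_error_sub_id;
// a retrying follower whose sub id is not older must give up with
// ER_PRIOR_COMMIT_FAILED instead of retrying.
struct rpl_parallel_entry
{
  static const uint64_t NO_ERROR= UINT64_MAX; // illustrative sentinel
  uint64_t stop_on_error_sub_id= NO_ERROR;
};

bool may_retry(const rpl_parallel_entry &entry, uint64_t my_sub_id)
{
  return my_sub_id < entry.stop_on_error_sub_id;
}
```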
This patch re-enables test galera.galera_var_max_ws_rows.
The test did not work because there were two distinct places where
the server was incrementing the member THD::wsrep_affected_rows before
enforcing wsrep_max_ws_rows. Essentially, the test would fail because
every inserted row was counted twice.
The patch removes the extra code.
Problem:
=======
Mariabackup exits during prepare phase if it encounters
MLOG_INDEX_LOAD redo log record. MLOG_INDEX_LOAD record
informs Mariabackup that the backup cannot be completed based
on the redo log scan, because some information is purposely
omitted due to bulk index creation in ALTER TABLE.
Solution:
========
Detect the MLOG_INDEX_LOAD redo log record during the backup phase and
exit mariabackup with a proper error message.
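A stubbed sketch of that check; the enum and message text are illustrative:
```
#include <cstdio>
#include <cstdlib>

enum mlog_id_t { MLOG_OTHER, MLOG_INDEX_LOAD }; // reduced for illustration

void backup_scan_record(mlog_id_t type)
{
  if (type == MLOG_INDEX_LOAD)
  {
    // Bulk index creation skipped redo logging, so the backup cannot be
    // consistent; fail now instead of during --prepare.
    fprintf(stderr, "mariabackup: bulk index creation detected; "
                    "please retry the backup\n");
    exit(EXIT_FAILURE);
  }
  // ... handle other record types ...
}
```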
buf_flush_page_cleaner_coordinator(): Signal the worker threads
to exit while waiting for them to exit. Apparently, signals are
sometimes lost, causing shutdown to occasionally hang when
multiple page cleaners (and buffer pool instances) are used,
that is, when innodb_buffer_pool_size is at least 1 GiB.
buf_flush_page_cleaner_close(): Merge with the only caller.
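An illustrative condition-variable version of the "signal while waiting" idea; the real code uses InnoDB events, and all names below are stand-ins:
```
#include <chrono>
#include <condition_variable>
#include <mutex>

std::mutex mtx;
std::condition_variable cv;
int n_workers= 4;               // decremented by each exiting worker
bool shutdown_requested= false;

// Coordinator side: keep re-signalling inside the wait loop, so a lost
// wake-up only delays shutdown instead of hanging it.
void coordinator_shutdown()
{
  std::unique_lock<std::mutex> lk(mtx);
  shutdown_requested= true;
  while (n_workers > 0)
  {
    cv.notify_all();                                // repeat the exit signal
    cv.wait_for(lk, std::chrono::milliseconds(10)); // wait for workers
  }
}
```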