In the merge eae968f62d, it turns out that
I had accidentally initiated an in-source build in the past, and
$MYSQL_TZINFO_TO_SQL was pointing to a stale copy of the executable in
the source directory, instead of the correct one in the build directory.
The test encryption.create_or_replace would occasionally fail,
because some fil_space_t::n_pending_ops would never be decremented.
fil_crypt_find_space_to_rotate(): If rotate_thread_t::should_shutdown()
holds due to innodb_encryption_threads having been reduced, do
release the reference.
fil_space_remove_from_keyrotation(), fil_space_next(): Declare the
functions static, simplify a little, and define in the same compilation
unit with the only caller, fil_crypt_find_space_to_rotate().
fil_crypt_key_mutex: Remove (unused).
Example of the failure:
http://buildbot.askmonty.org/buildbot/builders/bld-p9-rhel7/builds/4417/steps/mtr/logs/stdio
```
main.mysqld--help 'unix' w17 [ fail ]
Test ended at 2020-06-20 18:51:45
CURRENT_TEST: main.mysqld--help
--- /opt/buildbot-slave/bld-p9-rhel7/build/mysql-test/main/mysqld--help.result 2020-06-20 16:06:49.903604179 +0300
+++ /opt/buildbot-slave/bld-p9-rhel7/build/mysql-test/main/mysqld--help.reject 2020-06-20 18:51:44.886766820 +0300
@@ -1797,10 +1797,10 @@
sync-relay-log-info 10000
sysdate-is-now FALSE
system-versioning-alter-history ERROR
-table-cache 421
+table-cache 2000
table-definition-cache 400
-table-open-cache 421
-table-open-cache-instances 1
+table-open-cache 2000
+table-open-cache-instances 8
tc-heuristic-recover OFF
tcp-keepalive-interval 0
tcp-keepalive-probes 0
mysqltest: Result length mismatch
```
mtr: table_open_cache_basic autosized:
Lets assume that >400 are available and that
we can set the result back to the start value.
All of these system variables are autosized and can
generate MTR output differences.
Closes#1527
Problem:
The crash happened in FORMAT(double, dec>=31, 'de_DE').
The patch for MDEV-23118 (commit 0041dacc1b)
did not take into account that String::set_real() has a limit of 31
(FLOATING_POINT_DECIMALS) fractional digits. So for the range of 31..38
digits, set_real() switches to use:
- my_fcvt() - decimal point notation, e.g. 1.9999999999
- my_gcvt() - scientific notation, e.g. 1e22
my_gcvt() returned a shorter string than Item_func_format::val_str_ascii()
expected to get after the my_fcvt() call, so it crashed on assert.
Solution:
We cannot extend set_real() to use the my_fcvt() mode for the range of
31..38 fractional digits, because set_real() is used in a lot of places
and such a change will break everything.
Introducing String::set_fcvt() which always prints using my_fcvt()
for the whole range of decimals 0..38, supported by the FORMAT() function.
fix_fields for the arguments of the NTH_VALUE function was updating the same reference,
so for the second argument (or after the first argument) the items were not resolved
to their corresponding field from the view as they were updating the reference to the
first argument.
depending on build config the error might be hidded,
in particular liblz4.so and libjemalloc.so make it to disappear,
but with -DWITH_INNODB_LZ4=NO -DWITH_JEMALLOC=NO it reappears.
Removing the ORDER BY clause from the UNION when UNION is inside an IN/ALL/ANY/EXISTS subquery.
The rewrites are done for subqueries but this rewrite is not done for the fake_select of
the UNION.
The issue here is when records are read from the temporary file
(filesort result in this case) via a cache(rr_from_cache).
The cache is initialized with init_rr_cache.
For correlated subquery the cache allocation is happening at each execution
of the subquery but the deallocation happens only once and that was
when the query execution was done.
So generally for subqueries we do two types of cleanup
1) Full cleanup: we should free all resources of the query(like temp tables).
This is done generally when the query execution is complete or the subquery
re-execution is not needed (case with uncorrelated subquery)
2) Partial cleanup: Minor cleanup that is required if
the subquery needs recalculation. This is done for all the structures that
need to be allocated for each execution (example SORT_INFO for filesort
is allocated for each execution of the correlated subquery).
The fix here would be free the cache used by rr_from_cache in the partial
cleanup phase.
FORMAT() can print more integer digits (than the argument has)
if rounding happens:
FORMAT(9.9,0) -> '10'
The old code did not take this into account.
Fix:
1. One extra digit is needed in case of rounding
- If args[1] is a not-NULL constant, then reserve space for one extra integer
digit if the requested number of decimals is less than args[0]->decimals.
- Otherwise, reserve space for one extra integer digit if
args[0]->decimals is not 0, because rounding can potentially happen
(depending on the exact data in arguments).
2. One extra digit is also needed if the argument has no integer digits,
e.g. in a data type like DECIMAL(38,38).
The conditions 1 and 2 are ORed.
3. Fixing FORMAT_MAX_DECIMALS from 30 to 38. This was forgotten in 10.2.1
(when the limit for the number of fractional digits in DECIMAL was extended).
In commit fd9ca2a742 (MDEV-23295)
we added a debug assertion, which caught a similar bug.
prepare_inplace_alter_table_dict(): If we had promised that
ALGORITHM=INPLACE or ALGORITHM=NOCOPY is supported, we must
preserve the ROW_FORMAT.
lock_rec_has_to_wait_in_queue(): Remove an obviously redundant assertion
that was added in commit a8ec45863b
and also enclose a Galera-specific condition in #ifdef WITH_WSREP.
Problem:- rpl_parallel2 was failing non-deterministically
Analysis:-
When FLUSH TABLES WITH READ LOCK is executed, it will allow all worker
threads to complete their ongoing transactions and then it will pause them.
At this state FTWRL will proceed to acquire global read lock. FTWRL first
blocks threads from starting new commits, then upgrades the lock to block
commit of existing transactions.
Step1:
FLUSH TABLES WITH READ LOCK - Blocks new commits
Step2:
* STOP SLAVE command enables 'force_abort=1' which unblocks workers,
they continue to execute events.
* T1: Waits in 'record_gtid' call to update 'gtid_slave_pos' table with
its current GTID, but it is blocked becuase of Step1.
* T2: Holds COMMIT lock and waits for T1 to commit.
Step3:
FLUSH TABLES WITH READ LOCK - Waiting to get BLOCK_COMMIT.
This results in deadlock. When STOP SLAVE command allows paused workers to
proceed, workers should skip the execution of all further events, similar
to 'conservative' parallel mode.
Solution:-
We will assign 1 to skip_event_group when we are aborted in do_ftwrl_wait.
rpl_parallel_entry->pause_sub_id is only reset when force_abort is off in
rpl_pause_after_ftwrl.
Lex_input_stream::scan_ident_delimited() could go beyond the end
of the input when a starting backtick (`) delimiter did not have a
corresponding ending backtick.
Fix: catch the case when yyGet() returns 0, which means
either eof-of-query or straight 0x00 byte inside backticks,
and make the parser fail on syntax error, displaying the left
backtick as the syntax error place.
In case of filename in a script like this:
SET CHARACTER_SET_CLIENT=17; -- 17 is 'filename'
SELECT doc.`Children`.0 FROM t1;
the ending backtick was not recognized as such because my_charlen() returns 0 for
a straight backtick (backticks must normally be encoded as @0060 in filename).
The same fix works for 'filename': the execution skips the backtick
and reaches the end of the query, then yyGet() returns 0.
This fix is OK for now. But eventually 'filename' should either be disallowed
as a parser character set, or fixed to handle encoded punctuation properly.
and inaccurately
Analysis: The list of all privileges is 118 characters wide. However, the
format of error message was: "%-.32s command denied to user...". get_length()
sets the maximum width to 32 characters. As a result, only first 32
characters of list of privilege are stored.
Fix: Changing the format to "%-.100T..." so that get_length() sets width to
100. Hence, first 100 characters of the list of privilege are stored and the
type specifier 'T' appends '...' so that truncation can be seen.
Diagnostics_area::sql_errno upon query from I_S with LIMIT ROWS EXAMINED
open_normal_and_derived_table() fails because the query was already killed
as rows examined by the query are more than the limit. However, this isn't a
real error.
Fix: Check if there is actually an error before calling thd->sql_errno()
and later send a warning in handle_select() if no real error.
Type_handler_temporal_result::Item_func_min_max_fix_attributes()
in an expression GREATEST(string,date), e.g:
SELECT GREATEST('1', CAST('2020-12-12' AS DATE));
incorrectly evaluated decimals as 6 (like for DATETIME).
Adding a separate virtual implementation:
Type_handler_date_common::Item_func_min_max_fix_attributes()
This makes the code simpler.
The code in Item_func_int_val::fix_length_and_dec_int_or_decimal()
calculated badly the result data type for FLOOR()/CEIL(), so for example
the decimal(38,10) input created a decimal(28,0) result.
That was not correct, because one extra integer digit is needed.
floor(-9.9) -> -10
ceil(9.9) -> 10
Rewritting the code in a more straightforward way.
Additional changes:
- FLOOR() now takes into account the presence of the UNSIGNED
flag of the argument: FLOOR(unsigned decimal) does not need an extra digits.
- FLOOR()/CEILING() now preserve the unsigned flag in the result
data type is decimal.
These changes give nicer data types.
lock_rec_has_to_wait
wsrep_kill_victim
lock_rec_create_low
lock_rec_add_to_queue
DeadlockChecker::select_victim()
THD can't change from normal transaction to BF (brute force) transaction
here, thus there is no need to syncronize access in wsrep_thd_is_BF
function.
lock_rec_has_to_wait_in_queue
Add condition that lock is not NULL and add assertions if we are in
strong state.
Problem:- rpl_parallel2 was failing non-deterministically
Analysis:-
When FLUSH TABLES WITH READ LOCK is executed, it will allow all worker
threads to complete their ongoing transactions and then it will pause them.
At this state FTWRL will proceed to acquire global read lock. FTWRL first
blocks threads from starting new commits, then upgrades the lock to block
commit of existing transactions.
Step1:
FLUSH TABLES WITH READ LOCK - Blocks new commits
Step2:
* STOP SLAVE command enables 'force_abort=1' which unblocks workers,
they continue to execute events.
* T1: Waits in 'record_gtid' call to update 'gtid_slave_pos' table with
its current GTID, but it is blocked becuase of Step1.
* T2: Holds COMMIT lock and waits for T1 to commit.
Step3:
FLUSH TABLES WITH READ LOCK - Waiting to get BLOCK_COMMIT.
This results in deadlock. When STOP SLAVE command allows paused workers to
proceed, workers should skip the execution of all further events, similar
to 'conservative' parallel mode.
Solution:-
We will assign 1 to skip_event_group when we are aborted in do_ftwrl_wait.
rpl_parallel_entry->pause_sub_id is only reset when force_abort is off in
rpl_pause_after_ftwrl.