The issue here is that for a DEPENDENT subquery with an aggregate function in the ORDER BY
clause, the aggregate is wrapped inside an Item_aggregate_ref. To compute the ORDER BY we
need to refer to the temp table field corresponding to this item. But in the function
make_sortorder we were explicitly casting the Item_aggregate_ref to Item_sum, which led to
us not getting the temp table field corresponding to the item.
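For example, a query of roughly this shape could hit the issue (table and
column names are illustrative, not from the original test case):

SELECT (SELECT MAX(t2.b)
        FROM t2
        WHERE t2.a = t1.a
        GROUP BY t2.c
        ORDER BY SUM(t2.b) DESC
        LIMIT 1) AS m
FROM t1;

Here the subquery is DEPENDENT (it refers to t1.a) and SUM(t2.b) in its
ORDER BY must be resolved to the corresponding temp table field.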
Make sure to initialize members of TABLE::reginfo when TABLE::init is called. In this case the problem
was that table->reginfo.join_tab was set for the SELECT query and then was reused by the UPDATE query.
This case occurred only when the SELECT query had a degenerate join.
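One possible shape of such a case (illustrative only; the exact test case is
not shown here):

SELECT * FROM t1, t2 WHERE 1 = 0;  -- join the optimizer reduces to a constant, empty result
UPDATE t1 SET a = a + 1;           -- reused the stale table->reginfo.join_tab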
A service registered with 2 arguments, "C:\path\to\mysqld.exe service_name",
would pass a non-null-terminated command line to win_main(), and this
would crash in defaults handling.
Make sure win_main() gets a null-terminated argv.
Quick grouping is not supported for JSON_OBJECTAGG. The same holds for GROUP_CONCAT,
so make sure that Item::quick_group is set to FALSE. We need to make sure that in
the case of JSON_OBJECTAGG we don't create an index over the grouping fields of
the temp table and update the result after each iteration.
Instead we should first sort the rows according to the
GROUP BY fields and then perform the grouping and
write the result to the temp table.
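For example, with quick grouping disabled, a query like the following (schema
is illustrative) goes through sort-then-group instead of updating rows of the
temp table through an index:

SELECT grp, JSON_OBJECTAGG(k, v)
FROM t1
GROUP BY grp;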
The issue here was that the left expr and right expr of the ANY subquery
had different character sets, so we were converting the left expr to the utf8 character set.
When this conversion happened we were actually converting the item inside the cache,
so it looked like <cache>(convert(t1.l1 using utf8)), which is incorrect.
To fix this problem we now store a reference to the left expr and convert that
to the utf8 character set, so it looks like convert(<cache>(`test`.`t1`.`l1`) using utf8).
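A hypothetical reproduction shape, assuming the two sides have different
character sets (all names are illustrative except t1.l1, which appears above):

CREATE TABLE t1 (l1 VARCHAR(10) CHARACTER SET latin1);
CREATE TABLE t2 (l2 VARCHAR(10) CHARACTER SET utf8);
SELECT * FROM t1 WHERE l1 > ANY (SELECT l2 FROM t2);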
When a high priority replication slave applier encounters a lock conflict in innodb,
it will force the conflicting lock holder transaction (the victim) to roll back.
This is a must in the multi-master synchronous replication model to avoid cluster lock-up.
This high priority victim abort (aka "brute force" (BF) abort) is started
from the innodb lock manager while holding the victim transaction's (trx) mutex.
Depending on the execution state of the victim transaction, it may happen that the
BF abort calls THD::awake() to wake up the victim transaction for the rollback.
Now, if BF abort requires THD::awake() to be called, the applier thread executes the
locking protocol: victim trx mutex -> victim THD::LOCK_thd_data.
If, at the same time, another DBMS super user issues a KILL command to abort the same victim,
it executes the locking protocol: victim THD::LOCK_thd_data -> victim trx mutex.
These two locking protocols acquire the mutexes in opposite order, hence an unresolvable
mutex deadlock may occur.
The fix in this commit adds a THD::wsrep_aborter flag to synchronize who can kill the victim.
This flag is set both when BF abort is called for from innodb and by the KILL command.
Either path of victim killing will bail out if the victim's wsrep_aborter is already
set, to avoid mutex conflicts with the other aborter's execution. THD::wsrep_aborter
records the aborter THD's ID. This is needed to preserve the right to kill
the victim from different locations for the same aborter thread.
It is also good for error logging, to see who is responsible for the abort.
A new test case was added in galera.galera_bf_kill_debug.test for the scenario where
the wsrep applier thread and a manual KILL command try to kill the same idle victim.
Problem:
========
Relay_log_info::flush reports the following MSAN issue:
==17820==WARNING: MemorySanitizer: use-of-uninitialized-value is reported
#5 0x00005584f0981441 in my_write (Filedes=22,
Buffer=0x72500003e818 "5\n./slave-relay-bin.000003\n21385\n
master-bin.000001\n21643\n0\n", '\245' <repeats 141 times>..., Count=118,
MyFlags=532) at /home/sujatha/bug_repo/test-10.5-msan/mysys/my_write.c:49
Analysis:
=========
In parallel replication, at the end of each statement execution the worker execution
status is updated in the 'relay-log.info' file. When two workers try to flush
the status at the same time, since the write to the cache is not serialized, both
workers write to the same address simultaneously and increment the
length twice. Because of this the length of the buffer is more than the actual data.
When the flush code tries to read the buffer beyond the valid data length, MSAN
reports an uninitialized-value error.
Fix:
===
Serialize the relay log flush operation using "rli->data_lock".
Analysis:
========
When "Profiling" is enabled, server collects the resource usage of each
statement that gets executed in current session. Profiling doesn't support
nested statements. In order to ensure this behavior when profiling is enabled
for a statement, there should not be any other active query which is being
profiled. This active query information is stored in 'current' variable. When
a nested query arrives it finds 'current' being not NULL and server aborts.
When 'init_connect' and 'init_slave' system variables are set they contain a
set of statements to be executed. "execute_init_command" is the function call
which invokes "dispatch_command" for each statement provided in
'init_connect', 'init_slave' system variables. "execute_init_command" invokes
"start_new_query" and it passes the statement list to "dispatch_command". This
"dispatch_command" intern invokes "start_new_query" which leads to nesting of
queries. Hence '!current' assert is triggered.
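A hypothetical sequence that exercises this path (the statement in
init_connect is illustrative):

SET GLOBAL init_connect = 'SELECT 1';
SET GLOBAL profiling = ON;
-- a new connection of a user without SUPER now runs 'SELECT 1' through
-- execute_init_command and hits the '!current' assert in dispatch_command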
Fix:
===
Remove profiling from "execute_init_command" as it will be done within
"dispatch_command" execution.
When a transaction fails in the certification phase, it has consumed one GTID, but as
the transaction must roll back, it will not go through commit ordering, and because of this
the wsrep XID checkpointing can also happen out of order.
This PR makes the thread which has failed due to a certification failure wait for its
commit order turn before checkpointing the wsrep XID in the innodb rollback segment.
There is a specific test for wsrep XID checkpointing ordering in the mtr test
mysql-wsrep-bugs-607, which is added in this PR.
Test galera_slave_replay also depends on this fix, as the second test phase
may also assert because of bad wsrep XID checkpointing order.
galera_slave_replay.test also had other problems, which caused the test to
fail immediately; these are now fixed in this PR as well.
- select_describe() should not attempt to produce query plans
for subqueries if the query is handled by a Select Handler.
- JOIN::save_explain_data_intern should not add links to Explain_select
for children selects if:
1. The whole query is handled by the Select Handler, or
2. this select (and so its children) is handled by Derived Handler.
MDL_lock::Ticket_list::remove_ticket(): reduce algorithmic
complexity from O(N) to O(1)
MDL_lock::Ticket_list::clear_bit_if_not_in_list(): removed
MDL_lock::Ticket_list::m_type_counters: a map of ticket type
to count. Initialization is memset(0) which takes time.
Starting from 10.3, the optimizer is able to detect that entire outer join
nests are constant (because of an "Impossible ON") and remove them (see
mark_join_nest_as_const).
However, this was not properly accounted for in the NESTED_JOIN structure
and in the way check_interleaving_with_nj() uses its n_tables member to
check if the join prefix order is allowed.
(The result was that the optimizer could conclude that no join prefix is
allowed and fail an assertion.)
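For example (tables are illustrative), the outer join nest (t2, t3) below has
an impossible ON clause, so the whole nest is detected as constant and removed:

SELECT *
FROM t1
LEFT JOIN (t2 JOIN t3 ON t2.a = t3.a) ON 1 = 0
JOIN t4 ON t1.b = t4.b;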
The parser must reject DDL operations on temporary objects when
they may modify or alter such an object, including temporary tables and sequences.
The rejection happens regardless of the binlogging capability of the server or
connection (it has already been in place for bin-loggable DMLs).
The patch implements the requirement. A binlog test is added.
Item_func_json_extract did not implement val_decimal(),
so CAST(JSON_EXTRACT('{"x":true}', '$.x') AS DECIMAL) erroneously
returned 0 with a warning because of the conversion from the string "true"
to decimal.
Implementing val_decimal() so that boolean values are correctly handled.
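For example (the corrected result below is inferred from the description
above, not quoted from the patch):

SELECT CAST(JSON_EXTRACT('{"x":true}', '$.x') AS DECIMAL);
-- before the fix: 0, with a warning from converting the string "true"
-- after the fix:  1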
The code in fill_schema_schemata() did not take into account that
make_db_list() can leave db_names empty if the requested database
name was too long, so the call to db_names.at(0) crashed on an assert.
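A hypothetical query shape that could leave db_names empty (the exact test
case is not shown here):

SELECT schema_name
FROM information_schema.schemata
WHERE schema_name = REPEAT('a', 2000);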
- Moving the code testing if the database directory exists
into a separate function verify_database_directory_exists()
- Modifying the test to check if db_names is not empty
When converting a table (test.t1) from S3 to another engine, the
following will be logged to the binary log:
DROP TABLE IF EXISTS test.t1;
CREATE OR REPLACE TABLE test.t1 (...) ENGINE=new_engine
INSERT rows to test.t1 in binary-row-log-format
The bug is that the above statements are logged one by one to the binary
log. This means that a fast slave, configured to use the same S3 storage
as the master, would be able to execute the DROP and CREATE from the
binary log before the master has finished the ALTER TABLE.
In this case the slave would ignore the DROP (as it's on an S3 table), but
it would stop on the CREATE of the local table, as the table still exists in
S3. The REPLACE part will be ignored by the slave as it can't touch the
S3 table.
The fix is to ensure that all the above statements are written to the binary
log AFTER the table has been deleted from S3.
- Rewrote bool Query_compressed_log_event::write() to make it more readable
(no logic changes).
- Changed DBUG_PRINT of 'is_error:' to 'is_error():' to make it easier to
find 'error:' in traces.
- Ensure that 'db' is never null in Query_log_event (Simplified code).
copy_data_between_tables() sets to->s->default_fields to 0 as part
of the code disabling ON UPDATE actions for all old fields
(so ON UPDATE is enabled only for new fields during copying).
After the actual copying, copy_data_between_tables() did not restore
to->s->default_fields to the original value. As a result, the
TABLE_SHARE to->s was left in a wrong state after copy_data_between_tables(),
and a further open_table_from_share() using this TABLE_SHARE did not
populate TABLE::default_field, which then made
TABLE::evaluate_update_default_function() crash on access to a NULL
pointer.
Fix:
Changing copy_data_between_tables() to restore to->s->default_fields
to its original value after the copying loop.
Problem:- When we issue FTWRL (FLUSH TABLES WITH READ LOCK) with shutdown in
parallel, there is a race between FTWRL and shutdown. Shutdown might destroy
the mutex (pool->LOCK_rpl_thread_pool) before FTWRL can lock it, so we can get
a crash in the FTWRL thread.
Solution:- mysql_mutex_destroy(pool->LOCK_rpl_thread_pool) should wait for the
FTWRL thread to complete its work, and only then destroy the mutex.
So slave_prepare_for_shutdown will just deactivate the pool, and the mutex is destroyed
later in end_slave().
Problem:
=======
The "Start binlog_dump" message hasn't been updated to include the slave's
requested GTID position:
20:05:05 139836760311552 [Note] Start binlog_dump to slave_server(2), pos(, 4)
For diagnostic purposes, it would be helpful if the GTID position were
included.
Fix:
===
Improve the "Start binlog_dump" message to include the "using_gtid" flag and
the "GTID position" requested by the slave.
Ex:
[Note] Start binlog_dump to slave_server(2), pos(, 4), using_gtid(1),
gtid('1-1-201,2-2-100')
[Note] Start binlog_dump to slave_server(3), pos('mariadb-bin.004142',
507988273), using_gtid(0), gtid('')
After this code
end_inplace:
if (thd->locked_tables_list.reopen_tables(thd, false))
goto err_with_mdl_after_alter;
the table is not reopened (need_reopen is false) but
some_table_marked_for_reopen is reset to false.
An Item_field is allocated when the table is locked and is assigned a new name
on the first ALTER, which is then freed at the end of the command. The second
ALTER accesses this Item_field and gets a garbage value.
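A hypothetical sequence matching this description (names are illustrative):

LOCK TABLES t1 WRITE;
ALTER TABLE t1 CHANGE COLUMN a b INT;  -- Item_field gets the new name, freed at end of command
ALTER TABLE t1 CHANGE COLUMN b c INT;  -- accesses the freed name: garbage
UNLOCK TABLES;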
The problem was that FLUSH TABLES was trying to read the latest sequence state,
which conflicted with a running ALTER SEQUENCE. Removed the reading
of the state when opening a table for FLUSH, as it's not needed in this
case.
Other fix:
- Fixed a potential issue with concurrently running ALTER SEQUENCE where
the later ALTER could read old data.
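An illustrative concurrent scenario (the sequence name is hypothetical):

-- connection 1:
ALTER SEQUENCE s1 MAXVALUE 1000;
-- connection 2, at the same time:
FLUSH TABLES;  -- used to read the latest state of s1 and conflict with the ALTER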