Problem:
========
CHANGE MASTER TO MASTER_USER='root', MASTER_SSL=0, MASTER_SSL_CA='',
MASTER_SSL_CERT='', MASTER_SSL_KEY='', MASTER_SSL_CRL='',
MASTER_SSL_CRLPATH='';
CHANGE MASTER TO MASTER_USER='root', MASTER_PASSWORD='', MASTER_SSL=0;
use-after-poison is reported for lex_mi->ssl_crl
File: sql_repl.cc
if (lex_mi->ssl_crl)
strmake_buf(mi->ssl_crl, lex_mi->ssl_crl);
Analysis:
========
At the end of CHANGE MASTER statement execution, the LEX_MASTER_INFO
parameters are reset so that the next query will have a clean state. But
'ssl_crl' and 'ssl_crl_path' members of LEX_MASTER_INFO object are not
cleared during 'LEX_MASTER_INFO::reset'. Hence when a new CHANGE MASTER
statement is executed, the stale value of lex_mi->ssl_crl is used, so ASAN
reports use-after-poison.
Fix:
===
Clear 'ssl_crl' and 'ssl_crl_path' as part of 'reset'.
Since 2017 (c2118a08b1) THD::awake() no longer requires LOCK_thd_data.
It uses LOCK_thd_kill, and this latter mutex is used to prevent
a thread of dying, not LOCK_thd_data as before.
For an IN/ANY/ALL subquery without an aggregate function and HAVING clause,
the GROUP BY clause is removed.
Due to the GROUP BY list being removed, the invalid reference in the GROUP BY
clause was never resolved.
Remove the GROUP BY list only when the all the items in the GROUP BY list
are resolved.
Also removing the GROUP BY list later would not affect the extension that allows
using non-aggregated field in an aggregate function (when ONLY_FULL_GROUP_BY
is not set) because the GROUP BY list is removed only when their is
NO aggregate function in IN/ALL/ANY subquery.
For BIT columns when EITS is collected, we store the integral value in
text representation in the min and max fields of the statistical table
When this value is retrieved from the statistical table to original table
field then we try to store the text representation in the original field
which is INCORRECT.
The value that is retrieved should be converted to integral type and that
value should be stored back in the original field. This would get us the
correct estimate for selectivity of the predicate.
The issue happens when the secondary keys are extended with primary
key parts. Inside the function TABLE_SHARE::init_from_binary_frm_image()
adds the length bytes for the primary key key parts to the length of the
secondary key. This is not needed because when the extended keys are
used we recalculate the length for the used key parts.
Also removed TABLE_SHARE::total_key_length as it is not used in the code
Apporved-by: Monty <monty@mariadb.org>
Apply the patch based on the patch by Varun Gupta:
PARAM::is_ror_scan might be used unitialized when check_quick_select()
is invoked for a "degenerate" SEL_ARG tree (e.g. one having type
SEL_ARG::IMPOSSIBLE).
Make check_quick_select() always initialize PARAM::is_ror_scan.
Problem is that not all plugins are loaded when wsrep_recover is executed.
Thus, we allow unknown system variables and extra system variables during
wsrep_recover. Any unknown system variables would still be caught when
the server starts up normally after the SST.
This patch actually fixes the bug MDEV-24675 and the bug MDEV-24618:
Assertion failure when TVC uses a row in the context expecting scalar value
The cause of these bugs is the same wrong call of the function that fixes
value expressions in the value list of a table value constructor.
The assertion failure happened when an expression in the value list is of
the row type. In this case an error message was expected, but it was not
issued because the function fix_fields_if_needed() was called for to
check fields of value expressions in a TVC instead of the function
fix_fields_if_needed_for_scalar() that would also check that the value
expressions are are of a scalar type.
The first bug happened when a table value expression used an expression
returned by single-row subselect. In this case the call of the
fix_fields_if_needed_for_scalar virtual function must be provided with
and address to which the single-row subselect has to be attached.
Test cases were added for each of the bugs.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
The assertion failed in handler::ha_reset upon SELECT under
READ UNCOMMITTED from table with index on virtual column.
This was the debug-only failure, though the problem is mush wider:
* MY_BITMAP is a structure containing my_bitmap_map, the latter is a raw
bitmap.
* read_set, write_set and vcol_set of TABLE are the pointers to MY_BITMAP
* The rest of MY_BITMAPs are stored in TABLE and TABLE_SHARE
* The pointers to the stored MY_BITMAPs, like orig_read_set etc, and
sometimes all_set and tmp_set, are assigned to the pointers.
* Sometimes tmp_use_all_columns is used to substitute the raw bitmap
directly with all_set.bitmap
* Sometimes even bitmaps are directly modified, like in
TABLE::update_virtual_field(): bitmap_clear_all(&tmp_set) is called.
The last three bullets in the list, when used together (which is mostly
always) make the program flow cumbersome and impossible to follow,
notwithstanding the errors they cause, like this MDEV-17556, where tmp_set
pointer was assigned to read_set, write_set and vcol_set, then its bitmap
was substituted with all_set.bitmap by dbug_tmp_use_all_columns() call,
and then bitmap_clear_all(&tmp_set) was applied to all this.
To untangle this knot, the rule should be applied:
* Never substitute bitmaps! This patch is about this.
orig_*, all_set bitmaps are never substituted already.
This patch changes the following function prototypes:
* tmp_use_all_columns, dbug_tmp_use_all_columns
to accept MY_BITMAP** and to return MY_BITMAP * instead of my_bitmap_map*
* tmp_restore_column_map, dbug_tmp_restore_column_maps to accept
MY_BITMAP* instead of my_bitmap_map*
These functions now will substitute read_set/write_set/vcol_set directly,
and won't touch underlying bitmaps.
Cause: no table->update_handler cloned at the moment of
vers_insert_history_row(). update_handler is needed because there
can't be several inited indexes at once in the same handler. First
index is inited by QUICK_RANGE_SELECT::reset(). Then when history row
is inserted check_duplicate_long_entry_key() is done and it requires
another index.
mutex order violation here.
when wsrep bf thread kills a conflicting trx, the stack is
wsrep_thd_LOCK()
wsrep_kill_victim()
lock_rec_other_has_conflicting()
lock_clust_rec_read_check_and_lock()
row_search_mvcc()
ha_innobase::index_read()
ha_innobase::rnd_pos()
handler::ha_rnd_pos()
handler::rnd_pos_by_record()
handler::ha_rnd_pos_by_record()
Rows_log_event::find_row()
Update_rows_log_event::do_exec_row()
Rows_log_event::do_apply_event()
Log_event::apply_event()
wsrep_apply_events()
and mutexes are taken in the order
lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data
When a normal KILL statement is executed, the stack is
innobase_kill_query()
kill_handlerton()
plugin_foreach_with_mask()
ha_kill_query()
THD::awake()
kill_one_thread()
and mutexes are
victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex
To fix the mutex order violation we kill the victim thd asynchronously,
from the manager thread
* provide an argument to the callback
* don't ignore a callback request if it's already present in the queue
* initialize mutex/cond/in_use flag before starting the thread,
in case the first callback queueing request arrives before
handle_manager had time to initialize
* set/check abort_manager under a mutex, otherwise handle_manager
thread might destroy LOCK_manager before stop_handle_manager
released it
* signal COND on queueing a callback, stop cond_wait on callback request
* always start the thread, even if flush_time is 0
* but keep the old behavior in embedded (no replication, no galera)
* style cleanups (e.g. remove volatile for a variable protected by a mutex)
We need to complete SST if both new and old start positions are
not same as initial positions. If they are initial positions
just set local uuid and seqno.
Sample log error message generated:
2021-01-21 2:33:24 139912137520896 [Note] Slave SQL thread exiting, replication stopped in log 'master-bin.000001' at position 369
33:24 139912137520896 [Note] master was 127.0.0.1:16400
2021-01-21 2:33:24 139912137828096 [Note] Slave I/O thread exiting, read up to log 'master-bin.000001', position 369
2021-01-21 2:33:24 139912137828096 [Note] master was 127.0.0.1:16400
Based on work by Hartmut Holzgraefe.
Reviewer: knielsen@knielsen-hq.org, Andrei, Sachin
I_S tables were materialized too late, an attempt to use table
statistics before the table was created caused a crash.
Let's move table creation up. it only needs read_set to
be calculated properly, this happens in JOIN::optimize_inner(),
after semijoin transformation.
Note that tables are not populated at that point, so most of the
statistics would make no sense anyway. But at least field sizes
will be correct. And it won't crash.
There were multiple problems here
* wsrep_trx_fragment_size should not be set when wsrep is disabled or provider is not loaded
* wsrep_trx_fragment_unit should not be set when wsrep is disabled or provider is not loaded
* wsrep_debug has no effect if wsrep is disabled or provider is not loaded
* wsrep_start_position should not be set when wsrep is disabled or provider is not loaded any other value than default
* wsrep_start_position should be changed only when we are joiner or initialized
* wsrep_start_position should be allowed to set only a value that exits, thus
we need to add error handling to wsrep_sst_complete
Problem:
========
Auto purge of relaylogs stops when relay-log-file is
'slave-relay-log.999999' and slave_parallel_threads is enabled.
Analysis:
=========
The problem is that in Relay_log_info::inc_group_relay_log_pos() function,
when two log names are compared via strcmp() function, it gives correct
result, when log name sequence numbers are of same digits(6 digits), But
when the number goes to 7 digits, a 999999 compares greater than
1000000, which is wrong, hence the bug.
Fix:
====
Extract the numeric extension part of the file name, convert it into
unsigned long and compare.
Thanks to David Zhao for the contribution.
Fix for MDEV-23033 fixes a problem in replication applying of transactions, which contain cascading foreign key delete for a table, which has indexed virtual column.
This fix adds slave_fk_event_map flag for table, to mark when the prelocking is needed for applying of a transaction.
See commit 608b0ee52e for more details.
However, this fix is targeted for async replication only, Rows_log_event::do_apply_event() has condition to rule out galera replication from the fix domain, and use cases suffering from MDEV-23033 and related MDEV-21153 will fail in galera cluster.
The fix in this commit removes the condition to rule out the setting of slave_fk_event_map flag from galera replication, and makes the fix in MDEV-23033 effective for galera replication as well.
However, the above fix has caused regressions for some galera_sr suite tests, which run tests for streaming replication.
This regression can be observed e.g. by: /mtr galera_sr.galera_sr_multirow_rollback --mysqld=--slave_run_triggers_for_rbr=yes
These galera_sr suite tests were failing in last phase of replication applying, where actual transaction is already applied, and streaming replication related meta data needs to be updated in wsrep system tables.
Opening the wsrep system tables failed for corrupt data in THD::lex:query_tables_list. The fix in this commit uses back query table list for the duration of fragment update operation.
Finally, a mtr test for virtual column support has been added. galera.galera_virtual_column.test has as first test a scenario from MDEV-21153
new fix
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
through 2nd execution of SP
This bug caused a server crash on the second call of any stored procedure
that contained an UPDATE statement over a multi-table view reporting an
error message at the prepare stage.
On the first call of the stored procedure after reporting an error at
the preparation stage of the UPDATE statement finished without calling
the function SELECT_LEX::save_prep_leaf_tables() for the SELECT used as
the definition of the view. This left the SELECT_LEX structure used by
the UPDATE statement in an inconsistent state for second call of the stored
procedure.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
m_status == DA_OK_BULK' failed in Diagnostics_area::message()
Analysis: Assertion failure happens because we reach the maximum limit to
examine rows.
Fix: Return the error state.
Fix for MDEV-23033 fixes a problem in replication applying of transactions, which contain cascading foreign key delete for a table, which has indexed virtual column.
This fix adds slave_fk_event_map flag for table, to mark when the prelocking is needed for applying of a transaction.
See commit 608b0ee52e for more details.
However, this fix is targeted for async replication only, Rows_log_event::do_apply_event() has condition to rule out galera replication from the fix domain, and use cases suffering from MDEV-23033 and related MDEV-21153 will fail in galera cluster.
The fix in this commit removes the condition to rule out the setting of slave_fk_event_map flag from galera replication, and makes the fix in MDEV-23033 effective for galera replication as well.
Finally, a mtr test for virtual column support has been added. galera.galera_virtual_column.test has as first test a scenario from MDEV-21153
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
Some DML operations on tables having unique secondary keys cause scanning
in the secondary index, for instance to find potential unique key violations
in the seconday index. This scanning may involve GAP locking in the index.
As this locking happens also when applying replication events in high priority
applier threads, there is a probabality for lock conflicts between two wsrep
high priority threads.
This PR avoids lock conflicts of high priority wsrep threads, which do
secondary index scanning e.g. for duplicate key detection.
The actual fix is the patch in sql_class.cc:thd_need_ordering_with(), where
we allow relaxed GAP locking protocol between wsrep high priority threads.
wsrep high priority threads (replication appliers, replayers and TOI processors)
are ordered by the replication provider, and they will not need serializability
support gained by secondary index GAP locks.
PR contains also a mtr test, which exercises a scenario where two replication
applier threads have a false positive conflict in GAP of unique secondary index.
The conflicting local committing transaction has to replay, and the test verifies
also that the replaying phase will not conflict with the latter repllication applier.
Commit also contains new test scenario for galera.galera_UK_conflict.test,
where replayer starts applying after a slave applier thread, with later seqno,
has advanced to commit phase. The applier and replayer have false positive GAP
lock conflict on secondary unique index, and replayer should ignore this.
This test scenario caused crash with earlier version in this PR, and to fix this,
the secondary index uniquenes checking has been relaxed even further.
Now innodb trx_t structure has new member: bool wsrep_UK_scan, which is set to
true, when high priority thread is performing unique secondary index scanning.
The member trx_t::wsrep_UK_scan is defined inside WITH_WSREP directive, to make
it possible to prepare a MariaDB build where this additional trx_t member is
not present and is not used in the code base. trx->wsrep_UK_scan is set to true
only for the duration of function call for: lock_rec_lock() trx->wsrep_UK_scan
is used only in lock_rec_has_to_wait() function to relax the need to wait if
wsrep_UK_scan is set and conflicting transaction is also high priority.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
Actual assertion mentioned on MDEV seems to be already fixed but
setting seqno to -2 will trigger a different assertion
mysqld: /home/jan/mysql/10.4-bugs/wsrep-lib/src/server_state.cpp:702: void wsrep::server_state::sst_received(wsrep::client_service&, int): Assertion `state_ == s_joiner || state_ == s_initialized' failed.
Fixed this by not allowing user to set seqno < -1 (-1 is special
seqno meaning undefined and seqno is initialized to it). MariaDB
releases 10.2 and 10.3 already do not allow to set seqno < -1.
On parsing statements for which a starting backtick (`) delimiter doesn't have
a corresponding ending backtick, a current pointer to a position inside a
pre-processed buffer could go beyond the end of the buffer.
This bug report caused by the commit d496765903
"MDEV-22022 Various mangled SQL statements will crash 10.3 to 10.5 debug builds".
In order to fix the issue both pointers m_ptr and m_cpp_ptr must be
rolled back to previous position in raw input and pre-processed input streams
correspondingly in case end of query reached during parsing.
constellation
Analysis: The decimals is set to NOT_FIXED_DEC for Field_str even if it is
NULL. Unsigned has decimals=0. So Type_std_attributes::decimals is set to 39
(maximum between 0 and 39). This results in incorrect number of decimals
when we have union of unsigned and NULL type.
Fix: Check if the field is created from NULL value. If yes, set decimals to 0
otherwise set it to NOT_FIXED_DEC.
Introduced val_time_packed and val_datetime_packed functions for Item_direct_ref
to make sure to get the value from the item it is referring to.
The issue for incorrect result was that the item was getting its value
from the temporary table rather than from the view.
For EITS collection min and max fields are allocated for each column
that is set in the read_set bitmap of a table. This allocation of min and max
fields happens inside alloc_statistics_for_table.
For a partitioned table ha_rnd_init is called inside the function
collect_statistics_for_table which sets the read_set bitmap for the columns
inside the partition expression. This happens only when there is a write lock
on the partitioned table.
But the allocation happens before this, so min and max fields are not allocated
for the columns involved in the partition expression.
This resulted in a crash, as the EITS statistics were collected but there was
no min and max field to store the value to.
The fix would be to call ha_rnd_init inside the function alloc_statistics_for_table
that would make sure that min and max fields are allocated for the columns
involved in the partition expression.
don't expect return type of a stored function to be valid.
it's read from a table, so can be messed with.
it even can contain \0 bytes in the middle of the type name
The issue here was we were trying to push an extracted condition for a view into the
underlying table value constructor inside the view.
The fix would be to not push conditions into table value constructors.
* be strict in CREATE TABLE, just like in ALTER TABLE, because
CREATE TABLE, just like ALTER TABLE, can be rolled back for any engine
* but don't auto-convert warnings into errors for engine warnings
(handler::create) - this matches ALTER TABLE behavior
* and not when creating a default record, these errors are handled
specially (and replaced with ER_INVALID_DEFAULT)
* always issue a Note when a non-unique key is truncated, because it's
not a Warning that can be converted to an Error. Before this commit
it was a Note for blobs and a Warning for all other data types.