Commit graph

71825 commits

Author SHA1 Message Date
Jan Lindström
208233be5a MDEV-24830 : Write a warning to error log if Galera replicates InnoDB table with no primary key
Two new features for Galera
* Write a warning to error log if Galera replicates table with storage engine not supported by Galera (at the moment only InnoDB is supported
** Warning is pushed to client also
** MyISAM is allowed if wsrep_replicate_myisam=ON
* Write a warning to error log if Galera replicates table with no primary key
** Warning is pushed to client also
** MyISAM is allowed if wsrep_relicate_myisam=ON
* In both cases apply flood control if > 10 same warning is writen to error log
(requires log_warnings > 1), flood control will suppress warnings for 300 seconds
2021-02-22 13:51:58 +02:00
Marko Mäkelä
94b4578704 Merge 10.5 into 10.6 2021-02-17 19:39:05 +02:00
Vladislav Vaintroub
e92c34ce34 Keep old GCC quiet. 2021-02-17 16:14:38 +01:00
Marko Mäkelä
16388f393c Merge mariadb-10.5.9 2021-02-17 16:19:49 +02:00
Sergei Golubchik
ae7989ca20 galera.galera_gra_log crashes
reset thd->lex->query_tables_own_last,
because open_table() uses it and will try to dereference
whatever garbage it might have
2021-02-16 01:11:41 +01:00
Monty
fc5e03f093 Ignore reporting in thd_progress_report() if we cannot lock LOCK_thd_data
The reason for this is that Galera can lock LOCK_thd_data for a long time.

Instead of stalling any long running process, like alter or repair table,
because of progress reporting, ignore the progress reporting for this
call. Progress reporting will continue on the next call after the lock has
been released.
2021-02-15 20:21:29 +02:00
Sergei Golubchik
25d9d2e37f Merge branch 'bb-10.4-release' into bb-10.5-release 2021-02-15 16:43:15 +01:00
Monty
34c654024c MDEV-24855 ER_CRASHED_ON_USAGE or Assertion `length <= column->length'
When creating a summary temporary table with bit fields used in the sum
expression with several parameters, like GROUP_CONCAT(), the counting of
bits needed in the record was wrong.

The reason we got an assert in Aria was because the bug caused a memory
overwrite in the record and Aria noticed that the data was 'impossible.
2021-02-15 01:33:06 +02:00
Sergei Golubchik
2696538723 updating @@wsrep_cluster_address deadlocks
wsrep_cluster_address_update() causes LOCK_wsrep_slave_threads
to be locked under LOCK_wsrep_cluster_config, while normally
the order should be the opposite.

Fix: don't protect @@wsrep_cluster_address value with the
LOCK_wsrep_cluster_config, LOCK_global_system_variables is enough.

Only protect wsrep reinitialization with the LOCK_wsrep_cluster_config.
And make it use a local copy of the global @@wsrep_cluster_address.

Also, introduce a helper function that checks whether
wsrep_cluster_address is set and also asserts that it can be safely
read by the caller.
2021-02-14 23:18:42 +01:00
Vladislav Vaintroub
4df0249b9a MDEV-24341 Innodb - do not block in foreground thread in log_write_up_to( 2021-02-14 18:30:39 +01:00
Jan Lindström
b3df194e31 MDEV-24833 : Signal 11 on wsrep_can_run_in_toi at wsrep_mysqld.cc:1994
Problem was that when engine substitution is allowd (e.g. sql_mode='')
we must also check db_type. Additionally, we did not resolve
default storage engine on that case and used that to check is
TOI possible or not.
2021-02-12 20:43:03 +02:00
Sergei Golubchik
b91e77cff3 fix a 3-way deadlock in galera_sr.galera-features#56
rarely (try --repeat 1000), the following happens:

* from wsrep_bf_abort (when a thread is being killed), wsrep-lib
starts streaming_rollback that wants to
convert_streaming_client_to_applier. wsrep_create_streaming_applier
creates a new THD(). All while the other THD is being killed,
so under LOCK_thd_kill and LOCK_thd_data. In particular, THD::init()
takes LOCK_global_system_variables under LOCK_thd_kill.

* updating @@wsrep_slave_threads takes LOCK_global_system_variables
and LOCK_wsrep_cluster_config (in that order) and invokes
wsrep_slave_threads_update() that takes LOCK_wsrep_slave_threads

* wsrep_replication_process() takes LOCK_wsrep_slave_threads and
invokes wsrep_close_applier(), that does thd->set_killed() which
takes LOCK_thd_kill.

et voilà.

As a fix I copied a workaround from wsrep_cluster_address_update()
to wsrep_slave_threads_update(). It seems to be safe: without mutexes
a race condition is possible and a concurrent SET might change
wsrep_slave_threads, but wsrep_slave_threads_update() always verifies
if there's a need to do something, so it will not run twice in this case,
it'll be a no-op.
2021-02-12 18:51:25 +01:00
Sergei Golubchik
259b945204 remove find_thread_with_thd_data_lock_callback
let the caller take the lock if needed
2021-02-12 18:17:07 +01:00
Sergei Golubchik
eac8341df4 MDEV-23328 Server hang due to Galera lock conflict resolution
adaptation of 29bbcac0ee for 10.4
2021-02-12 18:17:06 +01:00
Sergei Golubchik
9703cffa8c don't take mutexes conditionally 2021-02-12 18:14:20 +01:00
Sergei Golubchik
259a1902a0 cleanup: THD::abort_current_cond_wait()
* reuse the loop in THD::abort_current_cond_wait, don't duplicate it
* find_thread_by_id should return whatever it has found, it's the
  caller's task not to kill COM_DAEMON (if the caller's a killer)

and other minor changes
2021-02-12 18:05:34 +01:00
Sergei Golubchik
00a313ecf3 Merge branch 'bb-10.3-release' into bb-10.4-release
Note, the fix for "MDEV-23328 Server hang due to Galera lock conflict resolution"
was null-merged. 10.4 version of the fix is coming up separately
2021-02-12 17:44:22 +01:00
Marko Mäkelä
b19ec8848c Merge 10.5 into 10.6 2021-02-11 09:26:53 +02:00
Daniel Black
5e3d3220bb MDEV-24344: BINLOG REPLAY privilege is missing from SHOW PRIVILEGES
Was added in 10.5.2 (MDEV-21975)
2021-02-09 19:41:00 +11:00
Monty
ffc5d06489 MDEV-24087 s3.replication_partition fails in buildbot wiht replication failure
A few of the failures was because of missing sync_slave_to_master in
the test suite.

However, the biggest reason for most faulures was that in case of
ALTER PARTITION the master writes the query to the binary log before
it has updated the .frm and .par files. This causes a problem for an
S3 slave as it will start execute the ALTER PARTITION but get old .frm and
.par files from S3 which causes "open table" to fail, either with an error
or in some case with a crash.
Fixed
2021-02-08 21:03:04 +02:00
Monty
5d6ad2ad66 Added 'const' to arguments in get_one_option and find_typeset()
One should not change the program arguments!
This change also reduces warnings from the icc compiler.

Almost all changes are just syntax changes (adding const to
'get_one_option function' declarations).

Other changes:
- Added a few cast of 'argument' from 'const char*' to 'char *'. This
  was mainly in calls to 'external' functions we don't have control of.
- Ensure that all reset of 'password command line argument' are similar.
  (In almost all cases it was just adding a comment and a cast)
- In mysqlbinlog.cc and mysqld.cc there was a few cases that changed
  the command line argument. These places where changed to instead allocate
  the option in a MEM_ROOT to avoid changing the argument. Some of this
  code was changed to ensure that different programs did parsing the
  same way. Added a test case for the changes in mysqlbinlog.cc
- Changed a few variables that took their value from command line options
  from 'char *' to 'const char *'.
2021-02-08 12:16:29 +02:00
Sergei Golubchik
ef5adf5207 MDEV-24274 ALTER TABLE with CHECK CONSTRAINTS gives "Out of Memory" error
partially revert 76063c2a13. Item::clone() is not an all-purpose
Item copying machine, it was specifically created for pushdown
of predicates into derived tables and views and it does not
copy everything. In particular, it does not copy Item_func_regex.

Fix the bug differently by preserving the old constraint name.
But keep setting automatic_name=true to have it regenerated
for cases like ALTER TABLE ... ADD CONSTRAINT.
2021-02-07 16:22:11 +01:00
Marko Mäkelä
1110beccd4 Merge 10.5 into 10.6 2021-02-02 15:15:53 +02:00
Sergei Golubchik
6212cf86a2 galera fixes related to THD::LOCK_thd_kill
win
2021-02-02 14:08:07 +01:00
Sergei Golubchik
37e24970cb merge 2021-02-02 12:43:17 +01:00
Sergei Golubchik
2676c9aad7 galera fixes related to THD::LOCK_thd_kill
Since 2017 (c2118a08b1) THD::awake() no longer requires LOCK_thd_data.
It uses LOCK_thd_kill, and this latter mutex is used to prevent
a thread of dying, not LOCK_thd_data as before.
2021-02-02 10:02:17 +01:00
Vladislav Vaintroub
b5dab19efa MDEV-24757 : fix potential null pointer dereference in I_S.thread_pool_queues
With the fix, new connection without THDs will now show NULL in
thread_pool_queues.CONNECTION_ID.
2021-02-02 00:19:52 +01:00
Oleksandr Byelkin
bbbe7e781f sync changes in oracle parser 2021-02-01 13:50:10 +01:00
Sergei Golubchik
60ea09eae6 Merge branch '10.2' into 10.3 2021-02-01 13:49:33 +01:00
Sergei Petrunia
73c43ee9ed MDEV-24739: Assertion `root->weight >= ...' failed in SEL_ARG::tree_delete
Also update the SEL_ARG graph weight in:
- sel_add()
- SEL_ARG::clone()

Make key_{and,or}_with_limit() to also verify weight for the arguments
(There is no single point to verify SEL_ARG graphs constructed from
conditions that are not AND-OR formulas, so we hope that those are
connected with AND/OR and do it here).
2021-01-30 00:54:21 +03:00
Sergei Petrunia
c36720388d MDEV-9750: Quick memory exhaustion with 'extended_keys=on' ...
(Variant #5, full patch, for 10.5)

Do not produce SEL_ARG graphs that would yield huge numbers of ranges.
Introduce a concept of SEL_ARG graph's "weight". If we are about to
produce a graph whose "weight" exceeds the limit, remove the parts
of SEL_ARG graph that represent the biggest key parts. Do so until
the graph's is within the limit.

Includes
- debug code to verify SEL_ARG graph weight
- A user-visible @@optimizer_max_sel_arg_weight to control the optimization
- Logging the optimization into the optimizer trace.
2021-01-29 16:20:57 +03:00
Oleksandr Byelkin
17867608a2 ASAN heap-use-after-free in Item_exists_subselect::is_top_level_item
check that we can do type casting
2021-01-29 11:18:06 +01:00
sjaakola
a2eb974b50 MDEV-24721 galera.mysql-wsrep-bugs-607 test failure
The implementation for MDEV-17048 apperas to be direct copy from mysql version.
The group commit works differently in mariadb and the assert in wsrep_unregister_from_group_commit() is too strict.

The reason is that in: Wsrep_high_priority_service::log_dummy_write_set(), the transaction will undergo full rollback:
    {
      cs.before_rollback();
      cs.after_rollback();
    }

After that, the client's transaction state is set to be:  wsrep::transaction::s_aborted.
The execution then continues execution by:

...
 wsrep_register_for_group_commit(m_thd);
...
 wsrep_unregister_from_group_commit(m_thd);

The bogus assert in wsrep_unregister_from_group_commit() allows only transactions states of :s_ordered_commit or s_aborting.

As the fix, I brought back the same assert as is present in MariaDB 10.4 version.
2021-01-29 12:14:08 +02:00
Sergei Petrunia
33ede50f20 MDEV-22251: get_key_scans_params: Conditional jump or move depends on uninitialised value
Apply the patch based on the patch by Varun Gupta:

PARAM::is_ror_scan might be used unitialized when check_quick_select()
is invoked for a "degenerate" SEL_ARG tree (e.g. one having type
SEL_ARG::IMPOSSIBLE).

Make check_quick_select() always initialize PARAM::is_ror_scan.
2021-01-28 20:46:13 +03:00
Marko Mäkelä
c411393a84 MDEV-24715 Assertion !node->table->skip_alter_undo in CREATE...REPLACE SELECT
In commit 3cef4f8f0f (MDEV-515)
we inadvertently broke CREATE TABLE...REPLACE SELECT statements
by wrongly disabling row-level undo logging.

select_create::prepare(): Only invoke extra(HA_EXTRA_BEGIN_ALTER_COPY)
if no special treatment of duplicates is needed.
2021-01-28 15:26:53 +02:00
Marko Mäkelä
6d1f1b61b5 MDEV-24564 Statistics are lost after ALTER TABLE
Ever since commit 007f68c37f,
ALTER TABLE no longer invokes handler::open() after
handler::commit_inplace_alter_table().

ha_innobase::reload_statistics(): Reload or recompute statistics
after ALTER TABLE.

innodb_notify_tabledef_changed(): A new function to invoke
ha_innobase::reload_statistics().

handlerton::notify_tabledef_changed(): Add the parameter handler*
so that ha_innobase::reload_statistics() can be invoked.

ha_partition::notify_tabledef_changed(),
partition_notify_tabledef_changed(): Pass through the call
to any partitions or subpartitions.

This is based on code that was supplied by Monty.
2021-01-28 14:15:01 +02:00
Jan Lindström
20f6c403eb MDEV-20717 : Plugin system variables and activation options can break "mysqld --wsrep_recover"
Problem is that not all plugins are loaded when wsrep_recover is executed.
Thus, we allow unknown system variables and extra system variables during
wsrep_recover. Any unknown system variables would still be caught when
the server starts up normally after the SST.
2021-01-28 12:48:19 +02:00
Igor Babaev
bdae8bb6fd MDEV-24675 Server crash when table value constructor uses a subselect
This patch actually fixes the bug MDEV-24675 and the bug MDEV-24618:
Assertion failure when TVC uses a row in the context expecting scalar value

The cause of these bugs is the same wrong call of the function that fixes
value expressions in the value list of a table value constructor.
The assertion failure happened when an expression in the value list is of
the row type. In this case an error message was expected, but it was not
issued because the function fix_fields_if_needed() was called for to
check fields of value expressions in a TVC instead of the function
fix_fields_if_needed_for_scalar() that would also check that the value
expressions are are of a scalar type.
The first bug happened when a table value expression used an expression
returned by single-row subselect. In this case the call of the
fix_fields_if_needed_for_scalar virtual function must be provided with
and address to which the single-row subselect has to be attached.

Test cases were added for each of the bugs.

Approved by Oleksandr Byelkin <sanja@mariadb.com>
2021-01-26 18:22:39 -08:00
Nikita Malyavin
21809f9a45 MDEV-17556 Assertion `bitmap_is_set_all(&table->s->all_set)' failed
The assertion failed in handler::ha_reset upon SELECT under
READ UNCOMMITTED from table with index on virtual column.

This was the debug-only failure, though the problem is mush wider:
* MY_BITMAP is a structure containing my_bitmap_map, the latter is a raw
 bitmap.
* read_set, write_set and vcol_set of TABLE are the pointers to MY_BITMAP
* The rest of MY_BITMAPs are stored in TABLE and TABLE_SHARE
* The pointers to the stored MY_BITMAPs, like orig_read_set etc, and
 sometimes all_set and tmp_set, are assigned to the pointers.
* Sometimes tmp_use_all_columns is used to substitute the raw bitmap
 directly with all_set.bitmap
* Sometimes even bitmaps are directly modified, like in
TABLE::update_virtual_field(): bitmap_clear_all(&tmp_set) is called.

The last three bullets in the list, when used together (which is mostly
always) make the program flow cumbersome and impossible to follow,
notwithstanding the errors they cause, like this MDEV-17556, where tmp_set
pointer was assigned to read_set, write_set and vcol_set, then its bitmap
was substituted with all_set.bitmap by dbug_tmp_use_all_columns() call,
and then bitmap_clear_all(&tmp_set) was applied to all this.

To untangle this knot, the rule should be applied:
* Never substitute bitmaps! This patch is about this.
 orig_*, all_set bitmaps are never substituted already.

This patch changes the following function prototypes:
* tmp_use_all_columns, dbug_tmp_use_all_columns
 to accept MY_BITMAP** and to return MY_BITMAP * instead of my_bitmap_map*
* tmp_restore_column_map, dbug_tmp_restore_column_maps to accept
 MY_BITMAP* instead of my_bitmap_map*

These functions now will substitute read_set/write_set/vcol_set directly,
and won't touch underlying bitmaps.
2021-01-27 00:50:55 +10:00
Roman Nozdrin
0565d19973 MDEV-24298 Select Handler now process INSERT..SELECT with a single derived at
the top level
2021-01-26 14:07:14 +00:00
mkaruza
95a2bca01f MDEV-20008: Galera strict mode
Added new enum variable `wsrep_mode` which can be used to turn on WSREP
features which are not part of default behaviour.
Added enum `BINLOG_ROW_FORMAT_ONLY`, `REQUIRED_PRIMARY_KEY` and
`STRICT_REPLICATION`. `wsrep-mode=STRICT_REPLICATION` behaves
like variable `wsrep_strict_ddl`.

Variable wsrep_strict_ddl is deprecated and if set we use
new wsrep_mode setting instead.

Reviewed and improved by: Jan Lindström <jan.lindstrom@mariadb.com>
2021-01-26 14:02:24 +02:00
Aleksey Midenkov
1398160a71 MDEV-24522 Assertion `inited==NONE' fails upon UPDATE on versioned table with unique blob
Cause: no table->update_handler cloned at the moment of
vers_insert_history_row(). update_handler is needed because there
can't be several inited indexes at once in the same handler. First
index is inited by QUICK_RANGE_SELECT::reset(). Then when history row
is inserted check_duplicate_long_entry_key() is done and it requires
another index.
2021-01-26 14:41:23 +03:00
Jan Lindström
8cdeee177d MDEV-24509 : Warning: Memory not freed: 56 on SET @@global.wsrep_sst_auth
It seems that memory is not freed when updated value is NULL or
when wsrep is not initialized before shutdown.
2021-01-26 10:02:51 +02:00
Marko Mäkelä
3cef4f8f0f MDEV-515 Reduce InnoDB undo logging for insert into empty table
We implement an idea that was suggested by Michael 'Monty' Widenius
in October 2017: When InnoDB is inserting into an empty table or partition,
we can write a single undo log record TRX_UNDO_EMPTY, which will cause
ROLLBACK to clear the table.

For this to work, the insert into an empty table or partition must be
covered by an exclusive table lock that will be held until the transaction
has been committed or rolled back, or the INSERT operation has been
rolled back (and the table is empty again), in lock_table_x_unlock().

Clustered index records that are covered by the TRX_UNDO_EMPTY record
will carry DB_TRX_ID=0 and DB_ROLL_PTR=1<<55, and thus they cannot
be distinguished from what MDEV-12288 leaves behind after purging the
history of row-logged operations.

Concurrent non-locking reads must be adjusted: If the read view was
created before the INSERT into an empty table, then we must continue
to imagine that the table is empty, and not try to read any records.
If the read view was created after the INSERT was committed, then
all records must be visible normally. To implement this, we introduce
the field dict_table_t::bulk_trx_id.

This special handling only applies to the very first INSERT statement
of a transaction for the empty table or partition. If a subsequent
statement in the transaction is modifying the initially empty table again,
we must enable row-level undo logging, so that we will be able to
roll back to the start of the statement in case of an error (such as
duplicate key).

INSERT IGNORE will continue to use row-level logging and locking, because
implementing it would require the ability to roll back the latest row.
Since the undo log that we write only allows us to roll back the entire
statement, we cannot support INSERT IGNORE. We will introduce a
handler::extra() parameter HA_EXTRA_IGNORE_INSERT to indicate to storage
engines that INSERT IGNORE is being executed.

In many test cases, we add an extra record to the table, so that during
the 'interesting' part of the test, row-level locking and logging will
be used.

Replicas will continue to use row-level logging and locking until
MDEV-24622 has been addressed. Likewise, this optimization will be
disabled in Galera cluster until MDEV-24623 enables it.

dict_table_t::bulk_trx_id: The latest active or committed transaction
that initiated an insert into an empty table or partition.
Protected by exclusive table lock and a clustered index leaf page latch.

ins_node_t::bulk_insert: Whether bulk insert was initiated.

trx_t::mod_tables: Use C++11 style accessors (emplace instead of insert).
Unlike earlier, this collection will cover also temporary tables.

trx_mod_table_time_t: Add start_bulk_insert(), end_bulk_insert(),
is_bulk_insert(), was_bulk_insert().

trx_undo_report_row_operation(): Before accessing any undo log pages,
invoke trx->mod_tables.emplace() in order to determine whether undo
logging was disabled, or whether this is the first INSERT and we are
supposed to write a TRX_UNDO_EMPTY record.

row_ins_clust_index_entry_low(): If we are inserting into an empty
clustered index leaf page, set the ins_node_t::bulk_insert flag for
the subsequent trx_undo_report_row_operation() call.

lock_rec_insert_check_and_lock(), lock_prdt_insert_check_and_lock():
Remove the redundant parameter 'flags' that can be checked in the caller.

btr_cur_ins_lock_and_undo(): Simplify the logic. Correctly write
DB_TRX_ID,DB_ROLL_PTR after invoking trx_undo_report_row_operation().

trx_mark_sql_stat_end(), ha_innobase::extra(HA_EXTRA_IGNORE_INSERT),
ha_innobase::external_lock(): Invoke trx_t::end_bulk_insert() so that
the next statement will not be covered by table-level undo logging.

ReadView::changes_visible(trx_id_t) const: New accessor for the case
where the trx_id_t is not read from a potentially corrupted index page
but directly from the memory. In this case, we can skip a sanity check.

row_sel(), row_sel_try_search_shortcut(), row_search_mvcc():
row_sel_try_search_shortcut_for_mysql(),
row_merge_read_clustered_index(): Check dict_table_t::bulk_trx_id.

row_sel_clust_sees(): Replaces lock_clust_rec_cons_read_sees().

lock_sec_rec_cons_read_sees(): Replaced with lower-level code.

btr_root_page_init(): Refactored from btr_create().

dict_index_t::clear(), dict_table_t::clear(): Empty an index or table,
for the ROLLBACK of an INSERT operation.

ROW_T_EMPTY, ROW_OP_EMPTY: Note a concurrent ROLLBACK of an INSERT
into an empty table.

This is joint work with Thirunarayanan Balathandayuthapani,
who created a working prototype.
Thanks to Matthias Leich for extensive testing.
2021-01-25 18:41:27 +02:00
Marko Mäkelä
46234f03c8 Merge 10.5 into 10.6 2021-01-25 12:56:30 +02:00
Marko Mäkelä
961c7938bb Merge 10.4 into 10.5 2021-01-25 12:44:24 +02:00
Marko Mäkelä
3467f63764 Merge 10.3 into 10.4 2021-01-25 11:02:07 +02:00
Sergei Golubchik
29bbcac0ee MDEV-23328 Server hang due to Galera lock conflict resolution
mutex order violation here.
when wsrep bf thread kills a conflicting trx, the stack is

  wsrep_thd_LOCK()
  wsrep_kill_victim()
  lock_rec_other_has_conflicting()
  lock_clust_rec_read_check_and_lock()
  row_search_mvcc()
  ha_innobase::index_read()
  ha_innobase::rnd_pos()
  handler::ha_rnd_pos()
  handler::rnd_pos_by_record()
  handler::ha_rnd_pos_by_record()
  Rows_log_event::find_row()
  Update_rows_log_event::do_exec_row()
  Rows_log_event::do_apply_event()
  Log_event::apply_event()
  wsrep_apply_events()

and mutexes are taken in the order

  lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data

When a normal KILL statement is executed, the stack is

  innobase_kill_query()
  kill_handlerton()
  plugin_foreach_with_mask()
  ha_kill_query()
  THD::awake()
  kill_one_thread()

and mutexes are

  victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex

To fix the mutex order violation we kill the victim thd asynchronously,
from the manager thread
2021-01-24 11:35:55 +01:00
Sergei Golubchik
5d1db34585 cleanup: void hton::abort_transaction()
and void wsrep_innobase_kill_one_trx()

as their return values are never used.
Also remove redundant cast and checks that are always true
2021-01-24 11:35:55 +01:00
Sergei Golubchik
6a1cb449fe cleanup: remove slave background thread, use handle_manager thread instead 2021-01-24 11:35:55 +01:00