Commit graph

2796 commits

Author SHA1 Message Date
Marko Mäkelä
9f5a3e5689 Merge 10.7 into 10.8 2022-03-15 18:18:07 +02:00
Marko Mäkelä
dc4b7f382b Merge 10.6 into 10.7 2022-03-15 15:25:31 +02:00
Hugo Wen
dafc5fb9c1 MDEV-27342: Fix issue of recovery failure using new server id
Commit 6c39eaeb1 made the crash recovery dependent on server_id.
The crash recovery could fail when restoring a new instance from
original crashed data directory USING A NEW SERVER ID.

The issue doesn't exist in previous major versions before 10.6.

Root cause is when generating the input XID to be searched in the hash,
server id is populated with the current server id.
So if the server id changed when recovering, the XID couldn't be found
in the hash due to server id doesn't match.

This fix is to use original server id when creating the input XID
object in function `xarecover_do_commit_or_rollback`.

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
2022-03-14 19:57:10 -07:00
Oleksandr Byelkin
4fb2cb1a30 Merge branch '10.7' into 10.8 2022-02-04 14:50:25 +01:00
Oleksandr Byelkin
9ed8deb656 Merge branch '10.6' into 10.7 2022-02-04 14:11:46 +01:00
Oleksandr Byelkin
f5c5f8e41e Merge branch '10.5' into 10.6 2022-02-03 17:01:31 +01:00
Oleksandr Byelkin
cf63eecef4 Merge branch '10.4' into 10.5 2022-02-01 20:33:04 +01:00
Oleksandr Byelkin
a576a1cea5 Merge branch '10.3' into 10.4 2022-01-30 09:46:52 +01:00
Sergei Golubchik
d751f42a02 MDEV-27586 Auto-increment does not work with DESC on MERGE table
also fix handler::get_auto_increment()
2022-01-26 18:43:06 +01:00
Andrei
c9356223c9 MDEV-19555 assert Diagnostics_area::sql_errno() in ha_rollback_trans
Fixed the assert to restore pre-refactoring condition for
calling set_error() equivalent.
2022-01-26 18:23:51 +02:00
Aleksey Midenkov
7105c810bd MDEV-27217 typo fix 2022-01-15 14:17:19 +03:00
Aleksey Midenkov
585cb18ed1 MDEV-27452 TIMESTAMP(0) system field is allowed for certain creation of system-versioned table
First, we do not add VERS_UPDATE_UNVERSIONED_FLAG for system field and
that fixes SHOW CREATE result.

Second, we have to call check_sys_fields() for any CREATE TABLE and
there correct type is checked for system fields.

Third, we update system_time like as_row structures for ALTER TABLE
and that makes check_sys_fields() happy for ALTER TABLE when we make
system fields hidden.
2022-01-13 23:35:17 +03:00
Aleksey Midenkov
4d5ae2b325 MDEV-27217 DELETE partition selection doesn't work for history partitions
LIMIT history switching requires the number of history partitions to
be marked for read: from first to last non-empty plus one empty. The
least we can do is to fail with error message if the needed partition
was not marked for read. As this is handler interface we require new
handler error code to display user-friendly error message.

Switching by INTERVAL works out-of-the-box with
ER_ROW_DOES_NOT_MATCH_GIVEN_PARTITION_SET error.
2022-01-13 23:35:16 +03:00
Aleksey Midenkov
6d8794e567 MDEV-21650 Non-empty statement transaction on global rollback after TRT update error
TRT opens statement transaction. Cleanup it in case of error.
2022-01-12 23:50:23 +03:00
Marko Mäkelä
7dfaded962 Merge 10.6 into 10.7 2022-01-04 09:55:58 +02:00
Marko Mäkelä
3f5726768f Merge 10.5 into 10.6 2022-01-04 09:26:38 +02:00
Julius Goryavsky
55bb933a88 Merge branch 10.4 into 10.5 2021-12-26 12:51:04 +01:00
Julius Goryavsky
681b7784b6 Merge branch 10.3 into 10.4 2021-12-25 12:13:03 +01:00
Monty
ca2ea4ff41 Only apply wsrep_trx_fragment_size to InnoDB tables
MDEV-22617 Galera node crashes when trying to log to slow_log table in
streaming replication mode

Other things:
- Changed name of wsrep_after_row(two arguments) to
  wsrep_after_row_internal(one argument) to not depended on the
  function signature with unused arguments.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
	     Added test case
2021-12-23 11:51:31 +02:00
Alexander Barkov
a5ef74e7eb MDEV-27195 SIGSEGV in Table_scope_and_contents_source_st::vers_check_system_fields
The old code erroneously used default_charset_info to compare field names.
default_charset_info can point to any arbitrary collation,
including ucs2*, utf16*, utf32*, including those that do not
support strcasecmp().

my_charset_utf8mb4_unicode_ci, which is used in this scenario:

CREATE TABLE t1 ENGINE=InnoDB WITH SYSTEM VERSIONING AS SELECT 0;

does not support strcasecmp().

Fixing the code to use Lex_ident::streq(), which uses
system_charset_info instead of default_charset_info.
2021-12-22 13:12:40 +04:00
Marko Mäkelä
06988bdcaa Merge 10.6 into 10.7 2021-11-09 09:40:29 +02:00
Marko Mäkelä
25ac047baf Merge 10.5 into 10.6 2021-11-09 09:11:50 +02:00
Marko Mäkelä
9c18b96603 Merge 10.4 into 10.5 2021-11-09 08:50:33 +02:00
Marko Mäkelä
47ab793d71 Merge 10.3 into 10.4 2021-11-09 08:40:14 +02:00
Marko Mäkelä
f7054ff5df MariaDB 10.3.32 release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEF39AEP5WyjM2MAMF8WVvJMdM0dgFAmGJZJ0ACgkQ8WVvJMdM
 0dj8Jw//QD4uSbC4EHVdCDXWPQ9K/+Wv2A1DG4kCtngtQAVd/MgOpWK+9gdDCbKE
 Ce6m7627YLLzgBDzkEX/VkciHPd9GqvquqmgVKY1MdQ6efmmwgbzaGcaWcuJF8Z/
 C1pa7j0Duxn6nEuRbvM8OTgN4KfFlAc0OxpraJ7Fr8NvduLZQYMokBBW9DrJT1f1
 zGp4k05wUImBsmBt6teS073FS89frDL4J2aYGTXAxMjiqtno2MCopUIF2rpk5B29
 sJFaDpHCitNYDXuXZvWEWmuuss4vHz/NUYXM/GygfIteJqXKRLEOLAFBfvETyt4q
 6pYZDVfEGdKquHQu1a2XDI3W9+W1inmZ11dtebGnRexJTp9xeSxPhxiUvOQJj84A
 w6cQICCtlDCql3VlOIbt0vvAuXu+rOqhlqHorz0l62o6YjGE92z+NUL7B6gODip9
 RGd0gwCloPo+jGHnfpC6rvfcjA32vEx6L8giYTAYybxqjN1bMNIrix+7zwgfpZPZ
 0QRZtWtio/Iozj41q6x7dmP2Pxjll58+fPUEKevQn2iPm5WoPe+zrq3/lUdXFbZY
 3cz9fZch4YMTlhhu9BwuEmc2T9aIIm/YwYaB0Kmg55J/KT9xyerpMFZmRaF0VWcQ
 70ODJSMEDBhBW3n19LuYK/p3uJr551V/dFbZ/6lCXzbyp5i5MO8=
 =yIEG
 -----END PGP SIGNATURE-----

Merge mariadb-10.3.32 into 10.3
2021-11-09 07:59:36 +02:00
Aleksey Midenkov
c8cece9144 MDEV-26928 Column-inclusive WITH SYSTEM VERSIONING doesn't work with explicit system fields
versioning_fields flag indicates that any columns were specified WITH
SYSTEM VERSIONING. In that case we imply WITH SYSTEM VERSIONING for
the whole table and WITHOUT SYSTEM VERSIONING for the other columns.
2021-11-02 11:49:47 +03:00
sjaakola
ef2dbb8dbc MDEV-23328 Server hang due to Galera lock conflict resolution
Mutex order violation when wsrep bf thread kills a conflicting trx,
the stack is

          wsrep_thd_LOCK()
          wsrep_kill_victim()
          lock_rec_other_has_conflicting()
          lock_clust_rec_read_check_and_lock()
          row_search_mvcc()
          ha_innobase::index_read()
          ha_innobase::rnd_pos()
          handler::ha_rnd_pos()
          handler::rnd_pos_by_record()
          handler::ha_rnd_pos_by_record()
          Rows_log_event::find_row()
          Update_rows_log_event::do_exec_row()
          Rows_log_event::do_apply_event()
          Log_event::apply_event()
          wsrep_apply_events()

and mutexes are taken in the order

          lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data

When a normal KILL statement is executed, the stack is

          innobase_kill_query()
          kill_handlerton()
          plugin_foreach_with_mask()
          ha_kill_query()
          THD::awake()
          kill_one_thread()

        and mutexes are

          victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex

This patch is the plan D variant for fixing potetial mutex locking
order exercised by BF aborting and KILL command execution.

In this approach, KILL command is replicated as TOI operation.
This guarantees total isolation for the KILL command execution
in the first node: there is no concurrent replication applying
and no concurrent DDL executing. Therefore there is no risk of
BF aborting to happen in parallel with KILL command execution
either. Potential mutex deadlocks between the different mutex
access paths with KILL command execution and BF aborting cannot
therefore happen.

TOI replication is used, in this approach,  purely as means
to provide isolated KILL command execution in the first node.
KILL command should not (and must not) be applied in secondary
nodes. In this patch, we make this sure by skipping KILL
execution in secondary nodes, in applying phase, where we
bail out if applier thread is trying to execute KILL command.
This is effective, but skipping the applying of KILL command
could happen much earlier as well.

This also fixed unprotected calls to wsrep_thd_abort
that will use wsrep_abort_transaction. This is fixed
by holding THD::LOCK_thd_data while we abort transaction.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2021-10-29 20:40:35 +02:00
Jan Lindström
30337addfc MDEV-25114: Crash: WSREP: invalid state ROLLED_BACK (FATAL)
Revert "MDEV-23328 Server hang due to Galera lock conflict resolution"

This reverts commit 29bbcac0ee.
2021-10-29 10:00:05 +03:00
sjaakola
5c230b21bf MDEV-23328 Server hang due to Galera lock conflict resolution
Mutex order violation when wsrep bf thread kills a conflicting trx,
the stack is

          wsrep_thd_LOCK()
          wsrep_kill_victim()
          lock_rec_other_has_conflicting()
          lock_clust_rec_read_check_and_lock()
          row_search_mvcc()
          ha_innobase::index_read()
          ha_innobase::rnd_pos()
          handler::ha_rnd_pos()
          handler::rnd_pos_by_record()
          handler::ha_rnd_pos_by_record()
          Rows_log_event::find_row()
          Update_rows_log_event::do_exec_row()
          Rows_log_event::do_apply_event()
          Log_event::apply_event()
          wsrep_apply_events()

and mutexes are taken in the order

          lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data

When a normal KILL statement is executed, the stack is

          innobase_kill_query()
          kill_handlerton()
          plugin_foreach_with_mask()
          ha_kill_query()
          THD::awake()
          kill_one_thread()

        and mutexes are

          victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex

This patch is the plan D variant for fixing potetial mutex locking
order exercised by BF aborting and KILL command execution.

In this approach, KILL command is replicated as TOI operation.
This guarantees total isolation for the KILL command execution
in the first node: there is no concurrent replication applying
and no concurrent DDL executing. Therefore there is no risk of
BF aborting to happen in parallel with KILL command execution
either. Potential mutex deadlocks between the different mutex
access paths with KILL command execution and BF aborting cannot
therefore happen.

TOI replication is used, in this approach,  purely as means
to provide isolated KILL command execution in the first node.
KILL command should not (and must not) be applied in secondary
nodes. In this patch, we make this sure by skipping KILL
execution in secondary nodes, in applying phase, where we
bail out if applier thread is trying to execute KILL command.
This is effective, but skipping the applying of KILL command
could happen much earlier as well.

This also fixed unprotected calls to wsrep_thd_abort
that will use wsrep_abort_transaction. This is fixed
by holding THD::LOCK_thd_data while we abort transaction.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2021-10-29 09:52:52 +03:00
Sergei Golubchik
9e32f229d4 plugin can signal that it cannot be unloaded by failing deinit()
if plugin->deinit() returns a failure, it is no longer ignored, it
means that the plugin isn't ready to be unloaded from memory yet.
So it's marked "dying", deinitialized as much as possible,
but stays in memory until shutdown.

also:

* increment MARIA_PLUGIN_INTERFACE_VERSION
* rewrite ha_rocksdb to use the new approach, update the test
2021-10-27 15:55:14 +02:00
Aleksey Midenkov
d324c03d0c Vanilla cleanups and refactorings
Dead code cleanup:

part_info->num_parts usage was wrong and working incorrectly in
mysql_drop_partitions() because num_parts is already updated in
prep_alter_part_table(). We don't have to update part_info->partitions
because part_info is destroyed at alter_partition_lock_handling().

Cleanups:

- DBUG_EVALUATE_IF() macro replaced by shorter form DBUG_IF();
- Typo in ER_KEY_COLUMN_DOES_NOT_EXITS.

Refactorings:

- Splitted write_log_replace_delete_frm() into write_log_delete_frm()
  and write_log_replace_frm();
- partition_info via DDL_LOG_STATE;
- set_part_info_exec_log_entry() removed.

DBUG_EVALUATE removed

DBUG_EVALUTATE was only added for consistency together with
DBUG_EVALUATE_IF. It is not used anywhere in the code.

DBUG_SUICIDE() fix on release build

On release DBUG_SUICIDE() was statement. It was wrong as
DBUG_SUICIDE() is used in expression context.
2021-10-26 17:07:46 +02:00
Thirunarayanan Balathandayuthapani
045757af4c MDEV-24621 In bulk insert, pre-sort and build indexes one page at a time
When inserting a number of rows into an empty table,
InnoDB will buffer and pre-sort the records for each index, and
build the indexes one page at a time.

For each index, a buffer of innodb_sort_buffer_size will be created.

If the buffer ran out of memory then we will create temporary files
for storing the data.

At the end of the statement, we will sort and apply the buffered
records. Ideally, we would do this at the end of the transaction
or only when starting to execute a non-INSERT statement on the table.
However, it could be awkward if duplicate keys or similar errors
would be reported during the execution of a later statement.
This will be addressed in MDEV-25036.

Any columns longer than 2000 bytes will buffered in temporary files.

innodb_prepare_commit_versioned(): Apply all bulk buffered insert
operation, at the end of each statement.

ha_commit_trans(): Handle errors from innodb_prepare_commit_versioned().

row_merge_buf_write(): This function should accept blob
file handle too and it should write the field data which are
greater than 2000 bytes

row_merge_bulk_t: Data structure to maintain the data during
bulk insert operation.

trx_mod_table_time_t::start_bulk_insert(): Notify the start of
bulk insert operation and create new buffer for the given table

trx_mod_table_time_t::add_tuple(): Buffer a record.

trx_mod_table_time_t::write_bulk(): Do bulk insert operation
present in the transaction

trx_mod_table_time_t::bulk_buffer_exist(): Whether the buffer
storage exist for the bulk transaction

trx_mod_table_time_t::write_bulk(): Write all buffered insert
operation for the transaction and the table.

row_ins_clust_index_entry_low(): Insert the data into the
bulk buffer if it is already exist.

row_ins_sec_index_entry(): Insert the secondary tuple
if the bulk buffer already exist.

row_merge_bulk_buf_add(): Insert the tuple into bulk buffer
insert operation.

row_merge_buf_blob(): Write the field data whose length is
more than 2000 bytes into blob temporary file. Write the
file offset and length into the tuple field.

row_merge_copy_blob_from_file(): Copy the blob from blob file
handler based on reference of the given tuple.

row_merge_insert_index_tuples(): Handle blob for bulk insert
operation.

row_merge_bulk_t::row_merge_bulk_t(): Constructor. Initialize
the buffer and file for all the indexes expect fts index.

row_merge_bulk_t::create_tmp_file(): Create new temporary file
for the given index.

row_merge_bulk_t::write_to_tmp_file(): Write the content from
buffer to disk file for the given index.

row_merge_bulk_t::add_tuple(): Insert the tuple into the merge
buffer for the given index. If the memory ran out then InnoDB
should sort the buffer and write into file.

row_merge_bulk_t::write_to_index(): Do bulk insert operation
from merge file/merge buffer for the given index

row_merge_bulk_t::write_to_table(): Do bulk insert operation
for all the indexes.

dict_stats_update(): If a bulk insert transaction is in progress,
treat the table as empty. The index creation could hold latches
for extended amounts of time.
2021-10-26 15:01:44 +03:00
Marko Mäkelä
a8379e53e8 Merge 10.5 into 10.6
The changes to galera.galear_var_replicate_myisam_on
in commit d9b933bec6
are omitted due to conflicts
with commit 27d66d644c.
2021-10-13 13:28:12 +03:00
Marko Mäkelä
99bb3fb656 Merge 10.4 into 10.5 2021-10-13 12:33:56 +03:00
Marko Mäkelä
a736a3174a Merge 10.3 into 10.4 2021-10-13 12:03:32 +03:00
Aleksey Midenkov
d31f953789 MDEV-22660 SIGSEGV on adding system versioning and modifying system column
Second alter subcommand correctly removed VERS_ROW_END flag. We throw
ER_VERS_PERIOD_COLUMNS in such case.
2021-10-11 13:36:07 +03:00
Aleksey Midenkov
911c803db1 MDEV-22660 System versioning cleanups
- Cleaned up Vers_parse_info::check_sys_fields();
- Renamed VERS_SYS_START_FLAG, VERS_SYS_END_FLAG to VERS_ROW_START,
  VERS_ROW_END.
2021-10-11 13:36:06 +03:00
Marko Mäkelä
03c09837fc Merge 10.5 into 10.6 2021-09-16 20:17:12 +03:00
Monty
f03fee06b0 Improve error messages from Aria
- Error on commit now returns HA_ERR_COMMIT_ERROR instead of
  HA_ERR_INTERNAL_ERROR
- If checkpoint fails, it will now print out where it failed.
2021-09-15 19:27:34 +03:00
Vladislav Vaintroub
ae85835cc7 Fix warnings from -DPLUGIN_PARTITION=NO, portably.
Also fix Spider's cmake.
2021-09-05 20:00:13 +02:00
Marko Mäkelä
cc4e20e56f Merge 10.5 into 10.6 2021-08-26 10:20:17 +03:00
Marko Mäkelä
87ff4ba7c8 Merge 10.4 into 10.5 2021-08-26 08:46:57 +03:00
Marko Mäkelä
15b691b7bd After-merge fix f84e28c119
In a rebase of the merge, two preceding commits were accidentally reverted:
commit 112b23969a (MDEV-26308)
commit ac2857a5fb (MDEV-25717)

Thanks to Daniele Sciascia for noticing this.
2021-08-25 17:35:44 +03:00
Marko Mäkelä
f84e28c119 Merge 10.3 into 10.4 2021-08-18 16:51:52 +03:00
Leandro Pacheco
112b23969a MDEV-26308 : Galera test failure on galera.galera_split_brain
Contains following fixes:

* allow TOI commands to timeout while trying to acquire TOI with
override lock_wait_timeout with a LONG_TIMEOUT only after
succesfully entering TOI
* only ignore lock_wait_timeout on TOI
* fix galera_split_brain test as TOI operation now returns ER_LOCK_WAIT_TIMEOUT after lock_wait_timeout
* explicitly test for TOI

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2021-08-18 08:57:33 +03:00
Oleksandr Byelkin
6efb5e9f5e Merge branch '10.5' into 10.6 2021-08-02 10:11:41 +02:00
Oleksandr Byelkin
ae6bdc6769 Merge branch '10.4' into 10.5 2021-07-31 23:19:51 +02:00
mkaruza
eb26e20df5 MDEV-22421 Galera assertion !wsrep_has_changes(thd) || (thd->lex->sql_command == SQLCOM_CREATE_TABLE && !thd->is_current_stmt_binlog_format_row())
Updates to transaction registry table shouldn't be replicated in
cluster so there is no need to append wsrep keys.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2021-07-28 14:54:18 +03:00
Rucha Deodhar
beb401b25f This commit is a fixup for MDEV-22189.
Changes:
changed mysqld -> mariadbd for some more error messages that were left.
2021-07-26 22:59:10 +05:30
Andrei Elkin
e95f78f475 MDEV-21117 post-push to cover a "custom" xid format
Due to wsrep uses its own xid format for its recovery,
the xid hashing has to be refined.
When a xid object is not in the server "mysql" format,
the hash record made to contain the xid also in the full format.
2021-06-16 23:21:12 +03:00