Commit graph

13954 commits

Author SHA1 Message Date
Maheedhar PV
991d5dce32 Bug#31374305 - FORMAT() NOT DISPLAYING WHOLE NUMBER SIDE CORRECTLY FOR ES_MX AND ES_ES LOCALES
Changed the grouping and decimal separator for spanish locales as per
ICU.

Change-Id: I5d80fa59d3e66372d904e17c22c532d4dd2c565b
2022-01-21 16:02:34 +01:00
Sergei Golubchik
4504e6d14e test cases for MySQL bugs
also fix a comment, and update a macro just in case
2022-01-21 16:02:34 +01:00
Marko Mäkelä
c1d7b4575e MDEV-26870 --skip-symbolic-links does not disallow .isl file creation
The InnoDB DATA DIRECTORY attribute is not implemented via
symbolic links but something similar, *.isl files that contain
the names of data files.

InnoDB failed to ignore the DATA DIRECTORY attribute even though
the server was started with --skip-symbolic-links.

Native ALTER TABLE in InnoDB will retain the DATA DIRECTORY attribute
of the table, no matter if the table will be rebuilt or not.

Generic ALTER TABLE (with ALGORITHM=COPY) as well as TRUNCATE TABLE
will discard the DATA DIRECTORY attribute.

All tests have been run with and without the ./mtr option
--mysqld=--skip-symbolic-links
and some tests that use the InnoDB DATA DIRECTORY attribute
have been adjusted for this.
2022-01-21 14:43:59 +02:00
Thirunarayanan Balathandayuthapani
474c6df804 MDEV-27417 InnoDB spatial index updates change buffer bitmap page
- InnoDB change buffer doesn't support spatial index. Spatial
index should avoid change the buffer bitmap page when the page
split happens.
2022-01-20 12:50:47 +02:00
Daniel Black
410c4edef3 MDEV-27467: innodb to enforce the minimum innodb_buffer_pool_size in SET GLOBAL
.. to be the same as startup.

In resolving MDEV-27461, BUF_LRU_MIN_LEN (256) is the minimum number of
pages for the innodb buffer pool size. Obviously we need more than just
flushing pages. Taking the 16k page size and its default minimum, an
extra 25% is needed on top of the flushing pages to make a workable buffer
pool.

The minimum innodb_buffer_pool_chunk_size (1M) restricts the minimum
otherwise we'd have a pool made up of different chunk sizes.

The resulting minimum innodb buffer pool sizes are:

Page Size, Previously minimum (startup), with change.
        4k                            5M           2M
        8k                            5M           3M
       16k                            5M           5M
       32k                           24M          10M
       64k                           24M          20M

With this patch, SET GLOBAL innodb_buffer_pool_size minimums are
enforced.

The evident minimum system variable size for innodb_buffer_pool_size
is 2M, however this is only setable if using 4k page size. As
the order of the page_size and buffer_pool_size aren't fixed, we can't
hide this change.

Subsequent changes:
* innodb_buffer_pool_resize_with_chunks.test - raised of pool resize due to new
  minimums. Chunk size also needed increase as the test was for
  pool_size < chunk_size to generate a warning.
* Removed srv_buf_pool_min_size and replaced use with MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* Removed srv_buf_pool_def_size and replaced constant defination in
  MYSQL_SYSVAR_LONGLONG(buffer_pool_size)
* Reordered ha_innodb to allow for direct use of MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* Moved buf_pool_size_align into ha_innodb to access to MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* loose-innodb_disable_resize_buffer_pool_debug is needed in the
  innodb.restart.opt test so that under debug mode, resizing of the
  innodb buffer pool can occur.
2022-01-19 11:10:45 +11:00
Marko Mäkelä
d7f4fd30f2 MDEV-8851 innodb.innodb_information_schema fails sporadically
The column INFORMATION_SCHEMA.INNODB_LOCKS.LOCK_DATA
would report NULL when the page that contains the locked
record does not reside in the buffer pool.

Pages may be evicted from the buffer pool due to some background
activity, such as the purge of transaction history loading
undo log pages to the buffer pool. The regression tests intentionally
run with a small buffer pool size setting.

To prevent the intermittent test failures, we will filter out the
contents of the LOCK_DATA column from the output.
2022-01-14 15:53:29 +02:00
Aleksey Midenkov
585cb18ed1 MDEV-27452 TIMESTAMP(0) system field is allowed for certain creation of system-versioned table
First, we do not add VERS_UPDATE_UNVERSIONED_FLAG for system field and
that fixes SHOW CREATE result.

Second, we have to call check_sys_fields() for any CREATE TABLE and
there correct type is checked for system fields.

Third, we update system_time like as_row structures for ALTER TABLE
and that makes check_sys_fields() happy for ALTER TABLE when we make
system fields hidden.
2022-01-13 23:35:17 +03:00
Aleksey Midenkov
241ac79e49 MDEV-26778 row_start is not updated in current row for InnoDB
Update was skipped (need_update was false) because compare_record()
used HA_PARTIAL_COLUMN_READ branch and it skipped row_start check
has_explicit_value() was false. When we set bit for row_start in
has_value_set the row is updated with new row_start value.

The bug was caused by combination of MDEV-23446 and 3789692d17. The
latter one says:

  ... But generated columns that are written to the table are always
  deterministic and cannot change unless normal non-generated columns
  were changed. ...

Since MDEV-23446 generated row_start can change while non-generated
columns are not changed.

Explicit value flag came from HAS_EXPLICIT_DEFAULT which was used to
distinguish default-generated value from user-supplied one.
2022-01-13 23:35:17 +03:00
Aleksey Midenkov
4d5ae2b325 MDEV-27217 DELETE partition selection doesn't work for history partitions
LIMIT history switching requires the number of history partitions to
be marked for read: from first to last non-empty plus one empty. The
least we can do is to fail with error message if the needed partition
was not marked for read. As this is handler interface we require new
handler error code to display user-friendly error message.

Switching by INTERVAL works out-of-the-box with
ER_ROW_DOES_NOT_MATCH_GIVEN_PARTITION_SET error.
2022-01-13 23:35:16 +03:00
Aleksey Midenkov
f9f6b190cc Versioning test suite cleanups
Merged truncate_privilege and sysvars-notembedded into not_embedded.test
Merged partition_innodb into trx_id.test
2022-01-13 23:35:16 +03:00
Jan Lindström
a38b937bf1 MDEV-25201 : Assertion `thd->wsrep_trx_meta.gtid.seqno == (-1)' failed in int wsrep_to_isolation_begin(THD*, const char*, const char*, const TABLE_LIST*, Alter_info*)
Test case does not assert anymore but works incorrectly. We should
not replicate PREPARE using TOI.
2022-01-11 09:43:59 +02:00
Jan Lindström
e32c21cb93 Changing wsrep_slave_threads parameter requires that cluster
is connected so moved test here.
2022-01-11 09:43:59 +02:00
Jan Lindström
c430f612eb MDEV-25856 : SIGSEGV in ha_myisammrg::append_create_info
For MERGE-tables we need to init children list before calling
show_create_table and then detach children before we continue
normal mysql_create_like_table execution.
2022-01-11 09:43:59 +02:00
Brandon Nesterenko
96de6bfd5e MDEV-16091: Seconds_Behind_Master spikes to millions of seconds
Problem:
========
A slave’s relay log format description event is used when
calculating Seconds_Behind_Master (SBM). This forces the SBM
value to spike when processing these events, as their creation
date is set to the timestamp that the IO thread begins.

Solution:
========
When the slave generates a format description event, mark the
event as a relay log event so it does not update the
rli->last_master_timestamp variable.

Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
2022-01-04 11:21:33 -07:00
Andrei
80da35a326 MDEV-27365 CREATE-or-REPLACE SEQUENCE is binlogged without DDL flag
CREATE-OR-REPLACE SEQUENCE is not logged with Gtid event DDL flag
which affects its slave parallel execution.
Unlike other DDL:s it can occur in concurrent execution with following transactions
which can lead to various errors, including asserts like

(mdl_request->type != MDL_INTENTION_EXCLUSIVE && mdl_request->type != MDL_EXCLUSIVE) || !(get_thd()->rgi_slave && get_thd()->rgi_slave->is_parallel_exec && lock->check_if_conflicting_replication_locks(this)

in MDL_context::acquire_lock.

Fixed to wrap internal statement level commit with save-
and-restore of TRANS_THD::m_unsafe_rollback_flags.
2022-01-03 17:39:23 +02:00
Rucha Deodhar
fad1d15326 MDEV-25460: Assertion `!is_set() || (m_status == DA_OK_BULK && is_bulk_op())'
failed in Diagnostics_area::set_ok_status in my_ok from
mysql_sql_stmt_prepare

Analysis: Before PREPARE is executed, binlog_format is STATEMENT.
This PREPARE had SET STATEMENT which sets binlog_format to ROW. Now after
PREPARE is done we reset the binlog_format (back to STATEMENT). But we have
temporary table, it doesn't let changing binlog_format=ROW to
binlog_format=STATEMENT and gives error which goes unreported. This
unreported error eventually causes assertion failure.
Fix: Change return type for LEX::restore_set_statement_var() to bool and
make it return error state.
2021-12-28 16:59:29 +05:30
Julius Goryavsky
97695675c5 Merge branch 10.2 into 10.3 2021-12-24 04:17:55 +01:00
Julius Goryavsky
b5cbe50604 MDEV-24097: galera[_3nodes] suite tests in MTR sporadically fails
This is the first part of the fixes for MDEV-24097. This commit
contains the fixes for instability when testing Galera and when
restarting nodes quickly:

1) Protection against a "stuck" old SST process during the execution
   of the new SST (after restarting the node) is now implemented for
   mariabackup / xtrabackup, which should help to avoid almost all
   conflicts due to the use of the same ports - both during testing
   with mtr, so and when restarting nodes quickly in a production
   environment.
2) Added more protection to scripts against unexpected return of
   the rc != 0 (in the commands for deleting temporary files, etc).
3) Added protection against unexpected crashes during binlog transfer
   (in SST scripts for rsync).
4) Spaces and some special characters in binlog filenames shouldn't
   be a problem now (at the script level).
5) Daemon process termination tracking has been made more robust
   against crashes due to unexpected termination of the previous SST
   process while new scripts are running.
6) Reading ssl encryption parameters has been moved from specific
   SST scripts to a common wsrep_sst_common.sh script, which allows
   unified error handling, unified diagnostics and simplifies script
   revisions in the future.
7) Improved diagnostics of errors related to the use of openssl.
8) Corrections have been made for xtrabackup-v2 (both in tests and in
   the script code) that restore the work of xtrabackup with updated
   versions of innodb.
9) Fixed some tests for galera_3nodes, although the complete solution
   for the problem of starting three nodes at the same time on fast
   machines will be done in a separate commit.

No additional tests are required as this commit fixes problems with
existing tests.
2021-12-23 14:19:44 +01:00
Julius Goryavsky
3376668ca8 Merge branch 10.2 into 10.3 2021-12-23 14:14:04 +01:00
Marko Mäkelä
3b33593f80 MDEV-27332 SIGSEGV in fetch_data_into_cache()
Since commit fb335b48b5 we may have
a null pointer in purge_sys.query when fetch_data_into_cache() is
invoked and innodb_force_recovery>4. This is because the call to
purge_sys.create() would be skipped.

fetch_data_into_cache(): Load the purge_sys pseudo transaction pointer
to a local variable (null pointer if purge_sys is not initialized).
2021-12-21 11:07:25 +02:00
Aleksey Midenkov
3fd80d0874 MDEV-27244 Table corruption upon adding serial data type
MDEV-25803 excluded some cases from key sort upon alter table. That
particularly depends on ALTER_ADD_INDEX flag. Creating a column of
SERIAL data type missed that flag. Though equivalent operation

  alter table t1 add x bigint unsigned not null auto_increment unique;

has ALTER_ADD_INDEX flag.
2021-12-16 23:13:45 +03:00
Jan Lindström
f1ca949f2b Disable following tests from galera_3nodes suite
* galera_pc_bootstrap
* galera_ipv6_mariabackup
* galera_ipv6_mariabackup_section
* galera_ipv6_rsync
* galera_ipv6_rsync_section
* galera_ssl_reload
* galera_toi_vote
* galera_wsrep_schema_init

because MTR sporadaically fails: Failed to start mysqld or mysql_shutdown failed
2021-12-15 16:58:27 +02:00
Marko Mäkelä
ef9517eb81 MDEV-27268 Failed InnoDB initialization leaves garbage files behind
create_log_files(): Check log_set_capacity() before modifying
or creating any log files.

innobase_start_or_create_for_mysql(): If create_log_files()
fails and we were initializing a new database, delete the
system tablespace files before exiting.
2021-12-15 14:17:55 +02:00
Julius Goryavsky
7bc629a5ce MDEV-27181: Galera SST scripts should use ssl_capath for CA directory
1. Galera SST scripts should use ssl_capath (not ssl_ca) for CA
   directory. The current implementation tries to automatically
   detect the path using the trailing slash in the ssl_ca variable
   value, but this approach is not compatible with the server
   configuration. Now, by analogy with the server, SST scripts
   also use a separate ssl_capath variable. In addition, a similar
   tcapath variable has been added for the old-style configuration
   (in the "sst" section).
2. Openssl utility detection made more reliable.
3. Removed extra spaces in automatically generated command lines -
   to simplify debugging of the SST scripts.
4. In general, the code for detecting the presence or absence of
   auxiliary utilities has been improved - it is made more reliable
   in some configurations (and for shells other than bash).
2021-12-14 03:32:35 +01:00
Julius Goryavsky
8bb5563369 MDEV-27181: Galera SST scripts should use ssl_capath for CA directory
1. Galera SST scripts should use ssl_capath (not ssl_ca) for CA
   directory. The current implementation tries to automatically
   detect the path using the trailing slash in the ssl_ca variable
   value, but this approach is not compatible with the server
   configuration. Now, by analogy with the server, SST scripts
   also use a separate ssl_capath variable. In addition, a similar
   tcapath variable has been added for the old-style configuration
   (in the "sst" section).
2. Openssl utility detection made more reliable.
3. Removed extra spaces in automatically generated command lines -
   to simplify debugging of the SST scripts.
4. In general, the code for detecting the presence or absence of
   auxiliary utilities has been improved - it is made more reliable
   in some configurations (and for shells other than bash).
2021-12-14 03:25:19 +01:00
Marko Mäkelä
6b066ec332 MDEV-27235: Crash on SET GLOBAL innodb_encrypt_tables
fil_crypt_set_encrypt_tables(): If no encryption threads have been
initialized, do nothing.
2021-12-13 08:04:45 +02:00
forkfun
eafa2a1411 enable partition_open_files_limit test 2021-12-09 16:29:22 +01:00
forkfun
375ae890c7 enable rpl_semi_sync_after_sync and rpl_semi_sync_slave_compressed_protocol tests 2021-12-07 15:25:43 +01:00
Marko Mäkelä
289721de9a Merge 10.2 into 10.3 2021-11-29 10:33:06 +02:00
Julius Goryavsky
2f51511c08 MDEV-26915: SST scripts do not take log_bin_index setting into account
Currently, SST scripts assume that the filename specified in
the --log-bin-index argument either does not contain an extension
or uses the standard ".index" extension. Similar assumptions are
used for the log_bin_index parameter read from the configuration
file. This commit adds support for arbitrary extensions for the
index file paths.
2021-11-23 03:10:47 +01:00
Julius Goryavsky
b952599786 MDEV-26064: mariabackup SST fails when starting with --innodb-force-recovery
If the server is started with the --innodb-force-recovery argument
on the command line, then during SST this argument can be passed to
mariabackup only at the --prepare stage, and accordingly it must be
removed from the --mysqld-args list (and it is not should be passed
to mariabackup otherwise).

This commit fixes a flaw in the SST scripts and add a test that
checks the ability to run the joiner node in a configuration that
uses --innodb-force-recovery=1.
2021-11-23 03:10:47 +01:00
Marko Mäkelä
9962cda527 Merge 10.2 into 10.3 2021-11-17 13:55:54 +02:00
Eugene Kosov
ed0a224b3d MDEV-26747 improve corruption check for encrypted tables on ALTER IMPORT
fil_space_decrypt(): change signature to return status via dberr_t only.
Also replace impossible condition with an assertion and prove it via
test cases.
2021-11-17 15:49:22 +06:00
Marko Mäkelä
524b4a89da Merge 10.2 into 10.3 2021-11-09 08:26:59 +02:00
Marko Mäkelä
8c7e551da1 Remove restarts from encrypt_and_grep test 2021-11-09 08:08:29 +02:00
Marko Mäkelä
75f0c595d9 MariaDB 10.2.41 release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEF39AEP5WyjM2MAMF8WVvJMdM0dgFAmGJTwkACgkQ8WVvJMdM
 0dgWYA/9EzjoEEB+b0iwhhnw8VRrsIEsLo9Cf0yhAg2dfFqKpzHGP9DNxsnb/pdS
 gbNKODrzSqRFdkO79ThfRa2FwIEOlAGJIbV5njRPqAsRoZ3qQd20RqJ+gbr8/PYn
 Xf0KZz82FcBePZjzcrbCUkqhrfnCtFsbg41YzYFZT6ETDtXOvusd4/eTZr+lhptk
 dfxItFsaJMhi1RxNFQlj+u7rFGpLbXtgfGEoQXj0CjtVvV2tyBPLP4siuaUXWQ88
 XfH7ZFHL/0LxEVNO4QGzp2yc6N4ePXYVLGqDjn8HxquG0YrZ37Z+G++nNudrC+K8
 +THgKihP162lMS770TL+4WLLBDWpIE01Fsf9GhxJclK2oIpxxdiw4rCPNwL4LoGw
 0N6yQdYN50CxFFzBOFOid0fp401G4w0FxkwDRhRcN895vSFMpZ60QLBv6MOcLH9D
 OqFKYT29bG4zr7mzV6uNXXrdQ5q9VFeU8coUV4MLQymatxlVOpOLYnEY2qQ8AcN3
 EwnVaacoo1ZrmBnG56H3TNrUQSpFXtXRmDgR3wwWB9CcqeWU/ImWYbETgObhvSSG
 O0QzLtAgSMsdfRWDxjPgi8di3t7k9Yi2kZUAs8nQFQNwFbGJ5O1LlJrnpfbJcngi
 GR2t8Rbvm1hk0AJIIAWg2T48Dc/OOWUtXjbL+HszdfqsuwFvWT4=
 =MSGM
 -----END PGP SIGNATURE-----

Merge mariadb-10.2.41 into 10.2
2021-11-09 08:07:58 +02:00
Marko Mäkelä
f7054ff5df MariaDB 10.3.32 release
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEF39AEP5WyjM2MAMF8WVvJMdM0dgFAmGJZJ0ACgkQ8WVvJMdM
 0dj8Jw//QD4uSbC4EHVdCDXWPQ9K/+Wv2A1DG4kCtngtQAVd/MgOpWK+9gdDCbKE
 Ce6m7627YLLzgBDzkEX/VkciHPd9GqvquqmgVKY1MdQ6efmmwgbzaGcaWcuJF8Z/
 C1pa7j0Duxn6nEuRbvM8OTgN4KfFlAc0OxpraJ7Fr8NvduLZQYMokBBW9DrJT1f1
 zGp4k05wUImBsmBt6teS073FS89frDL4J2aYGTXAxMjiqtno2MCopUIF2rpk5B29
 sJFaDpHCitNYDXuXZvWEWmuuss4vHz/NUYXM/GygfIteJqXKRLEOLAFBfvETyt4q
 6pYZDVfEGdKquHQu1a2XDI3W9+W1inmZ11dtebGnRexJTp9xeSxPhxiUvOQJj84A
 w6cQICCtlDCql3VlOIbt0vvAuXu+rOqhlqHorz0l62o6YjGE92z+NUL7B6gODip9
 RGd0gwCloPo+jGHnfpC6rvfcjA32vEx6L8giYTAYybxqjN1bMNIrix+7zwgfpZPZ
 0QRZtWtio/Iozj41q6x7dmP2Pxjll58+fPUEKevQn2iPm5WoPe+zrq3/lUdXFbZY
 3cz9fZch4YMTlhhu9BwuEmc2T9aIIm/YwYaB0Kmg55J/KT9xyerpMFZmRaF0VWcQ
 70ODJSMEDBhBW3n19LuYK/p3uJr551V/dFbZ/6lCXzbyp5i5MO8=
 =yIEG
 -----END PGP SIGNATURE-----

Merge mariadb-10.3.32 into 10.3
2021-11-09 07:59:36 +02:00
Oleksandr Byelkin
a2f147af35 Merge branch '10.2' into 10.3 2021-11-05 19:58:32 +01:00
Andrei Elkin
561b6c7e51 MDEV-26833 Missed statement rollback in case transaction drops or create temporary table
When transaction creates or drops temporary tables and afterward its statement
faces an error even the transactional table statement's cached ROW
format events get involved into binlog and are visible after the transaction's commit.

Fixed with proper analysis of whether the errored-out statement needs
to be rolled back in binlog.
For instance a fact of already cached CREATE or DROP for temporary
tables by previous statements alone
does not cause to retain the being errored-out statement events in the
cache.
Conversely, if the statement creates or drops a temporary table
itself it can't be rolled back - this rule remains.
2021-11-05 19:33:28 +02:00
Aleksey Midenkov
c8cece9144 MDEV-26928 Column-inclusive WITH SYSTEM VERSIONING doesn't work with explicit system fields
versioning_fields flag indicates that any columns were specified WITH
SYSTEM VERSIONING. In that case we imply WITH SYSTEM VERSIONING for
the whole table and WITHOUT SYSTEM VERSIONING for the other columns.
2021-11-02 11:49:47 +03:00
Aleksey Midenkov
8ce5635a3e MDEV-22284 Aria table key read crash because wrong index used
When restoring lastinx last_key.keyinfo must be updated as well. The
good example is in _ma_check_index().

The point of failure is extra(HA_EXTRA_NO_KEYREAD) in
ha_maria::get_auto_increment():

  1. extra(HA_EXTRA_KEYREAD) saves lastinx;
  2. maria_rkey() changes index, so the lastinx and last_key.keyinfo;
  3. extra(HA_EXTRA_NO_KEYREAD) restores lastinx but not
     last_key.keyinfo.

So we have discrepancy between lastinx and last_key.keyinfo after 3.
2021-11-02 11:26:35 +03:00
Jan Lindström
db64924454 MDEV-23328 Server hang due to Galera lock conflict resolution
* Fix error handling NULL-pointer reference
* Add mtr-suppression on galera_ssl_upgrade
2021-11-02 07:23:40 +02:00
Jan Lindström
e571eaae9f MDEV-23328 Server hang due to Galera lock conflict resolution
Use better error message when KILL fails even in case TOI
fails.
2021-11-02 07:20:30 +02:00
Aleksey Midenkov
1be39f86cc MDEV-25552 system versioned partitioned by LIMIT tables break CHECK TABLE
Replaced HA_ADMIN_NOT_IMPLEMENTED error code by HA_ADMIN_OK. Now CHECK
TABLE does not fail by unsupported check_misplaced_rows(). Admin
message is not needed as well.

Test case is the same as for MDEV-21011 (a7cf0db3d8), the result have
been changed.
2021-11-02 04:52:03 +03:00
Jan Lindström
ea239034de MDEV-23328 Server hang due to Galera lock conflict resolution
* Fix error handling NULL-pointer reference
* Add mtr-suppression on galera_ssl_upgrade
2021-11-01 13:07:55 +02:00
Alexander Barkov
059797ed44 MDEV-24901 SIGSEGV in fts_get_table_name, SIGSEGV in ib_vector_size, SIGSEGV in row_merge_fts_doc_tokenize, stack smashing
strmake() puts one extra 0x00 byte at the end of the string.
The code in my_strnxfrm_tis620[_nopad] did not take this into
account, so in the reported scenario the 0x00 byte was put outside
of a stack variable, which made ASAN crash.

This problem is already fixed in in MySQL:

  commit 19bd66fe43c41f0bde5f36bc6b455a46693069fb
  Author: bin.x.su@oracle.com <>
  Date:   Fri Apr 4 11:35:27 2014 +0800

But the fix does not seem to be correct, as it breaks when finds a zero byte
in the source string.

Using memcpy() instead of strmake().

- Unlike strmake(), memcpy() it does not write beyond the destination
  size passed.
- Unlike the MySQL fix, memcpy() does not break on the first 0x00 byte found
  in the source string.
2021-10-29 12:37:29 +04:00
sjaakola
157b3a637f MDEV-23328 Server hang due to Galera lock conflict resolution
Mutex order violation when wsrep bf thread kills a conflicting trx,
the stack is

          wsrep_thd_LOCK()
          wsrep_kill_victim()
          lock_rec_other_has_conflicting()
          lock_clust_rec_read_check_and_lock()
          row_search_mvcc()
          ha_innobase::index_read()
          ha_innobase::rnd_pos()
          handler::ha_rnd_pos()
          handler::rnd_pos_by_record()
          handler::ha_rnd_pos_by_record()
          Rows_log_event::find_row()
          Update_rows_log_event::do_exec_row()
          Rows_log_event::do_apply_event()
          Log_event::apply_event()
          wsrep_apply_events()

and mutexes are taken in the order

          lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data

When a normal KILL statement is executed, the stack is

          innobase_kill_query()
          kill_handlerton()
          plugin_foreach_with_mask()
          ha_kill_query()
          THD::awake()
          kill_one_thread()

        and mutexes are

          victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex

This patch is the plan D variant for fixing potetial mutex locking
order exercised by BF aborting and KILL command execution.

In this approach, KILL command is replicated as TOI operation.
This guarantees total isolation for the KILL command execution
in the first node: there is no concurrent replication applying
and no concurrent DDL executing. Therefore there is no risk of
BF aborting to happen in parallel with KILL command execution
either. Potential mutex deadlocks between the different mutex
access paths with KILL command execution and BF aborting cannot
therefore happen.

TOI replication is used, in this approach,  purely as means
to provide isolated KILL command execution in the first node.
KILL command should not (and must not) be applied in secondary
nodes. In this patch, we make this sure by skipping KILL
execution in secondary nodes, in applying phase, where we
bail out if applier thread is trying to execute KILL command.
This is effective, but skipping the applying of KILL command
could happen much earlier as well.

This also fixed unprotected calls to wsrep_thd_abort
that will use wsrep_abort_transaction. This is fixed
by holding THD::LOCK_thd_data while we abort transaction.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2021-10-29 10:00:17 +03:00
sjaakola
db50ea3ad3 MDEV-23328 Server hang due to Galera lock conflict resolution
Mutex order violation when wsrep bf thread kills a conflicting trx,
the stack is

          wsrep_thd_LOCK()
          wsrep_kill_victim()
          lock_rec_other_has_conflicting()
          lock_clust_rec_read_check_and_lock()
          row_search_mvcc()
          ha_innobase::index_read()
          ha_innobase::rnd_pos()
          handler::ha_rnd_pos()
          handler::rnd_pos_by_record()
          handler::ha_rnd_pos_by_record()
          Rows_log_event::find_row()
          Update_rows_log_event::do_exec_row()
          Rows_log_event::do_apply_event()
          Log_event::apply_event()
          wsrep_apply_events()

and mutexes are taken in the order

          lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data

When a normal KILL statement is executed, the stack is

          innobase_kill_query()
          kill_handlerton()
          plugin_foreach_with_mask()
          ha_kill_query()
          THD::awake()
          kill_one_thread()

        and mutexes are

          victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex

This patch is the plan D variant for fixing potetial mutex locking
order exercised by BF aborting and KILL command execution.

In this approach, KILL command is replicated as TOI operation.
This guarantees total isolation for the KILL command execution
in the first node: there is no concurrent replication applying
and no concurrent DDL executing. Therefore there is no risk of
BF aborting to happen in parallel with KILL command execution
either. Potential mutex deadlocks between the different mutex
access paths with KILL command execution and BF aborting cannot
therefore happen.

TOI replication is used, in this approach,  purely as means
to provide isolated KILL command execution in the first node.
KILL command should not (and must not) be applied in secondary
nodes. In this patch, we make this sure by skipping KILL
execution in secondary nodes, in applying phase, where we
bail out if applier thread is trying to execute KILL command.
This is effective, but skipping the applying of KILL command
could happen much earlier as well.

This also fixed unprotected calls to wsrep_thd_abort
that will use wsrep_abort_transaction. This is fixed
by holding THD::LOCK_thd_data while we abort transaction.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2021-10-29 07:57:18 +03:00
Andrei Elkin
42ae765960 MDEV-26833 Missed statement rollback in case transaction drops or create temporary table
When transaction creates or drops temporary tables and afterward its statement
faces an error even the transactional table statement's cached ROW
format events get involved into binlog and are visible after the transaction's commit.

Fixed with proper analysis of whether the errored-out statement needs
to be rolled back in binlog.
For instance a fact of already cached CREATE or DROP for temporary
tables by previous statements alone
does not cause to retain the being errored-out statement events in the
cache.
Conversely, if the statement creates or drops a temporary table
itself it can't be rolled back - this rule remains.
2021-10-28 19:54:03 +03:00
Alice Sherepa
4e5cf34819 rpl_get_master_version_and_clock and rpl_row_big_table_id tests are slow, so let's not run them under valgrind 2021-10-28 11:15:54 +02:00