Commit graph

544 commits

Author SHA1 Message Date
Marko Mäkelä
bb9f010432 Merge 11.4 into 11.8 2025-03-05 20:39:47 +02:00
Marko Mäkelä
49a6baec56 Merge 10.11 into 11.4 2025-03-03 11:07:56 +02:00
Marko Mäkelä
6e6a1b316c MDEV-35000: dict_table_close() breaks STATS_AUTO_RECALC
stats_deinit(): Replaces dict_stats_deinit().
Deinitialize the statistics for persistent tables,
so that they will be reloaded or recalculated
on a subsequent ha_innobase::open().

ha_innobase::rename_table(): Invoke stats_deinit() so that the
subsequent ha_innobase::open() will reload the InnoDB persistent
statistics. That is, it will remain possible to have the InnoDB
persistent statistics reloaded by executing the following:
RENAME TABLE t TO tmp, tmp TO t;

dict_table_close(table): Replaced with table->release().
There will no longer be any logic that would attempt to ensure
that the InnoDB persistent statistics will be reloaded after
FLUSH TABLES has been executed. This also fixes the problem that
dict_table_t::stat_modified_counter would be frequently reset to 0,
whenever ha_innobase::open() is invoked after the table reference
count had dropped to 0.

dict_table_close(table, thd, mdl): Remove the parameter "dict_locked".
Do not try to invalidate the statistics.

ha_innobase::statistics_init(): Replaces dict_stats_init(table).

Reviewed by: Thirunarayanan Balathandayuthapani
2025-02-28 09:00:16 +02:00
Marko Mäkelä
1ed09cfdcb MDEV-35000 preparation: Clean up dict_table_t::stat
innodb_stats_transient_sample_pages, innodb_stats_persistent_sample_pages:
Change the type to UNSIGNED, because the number of pages in a table
is limited to 32 bits by the InnoDB file format.

btr_get_size_and_reserved(), fseg_get_n_frag_pages(),
fseg_n_reserved_pages_low(), fseg_n_reserved_pages(): Return uint32_t.
The file format limits page numbers to 32 bits.

dict_table_t::stat: An Atomic_relaxed<uint32_t> that combines a
number of metadata fields.

innodb_copy_stat_flags(): Copy the statistics flags from
TABLE_SHARE or HA_CREATE_INFO.

dict_table_t::stats_initialized(), dict_table_t::stats_is_persistent():
Accessors to dict_table_t::stat.

Reviewed by: Thirunarayanan Balathandayuthapani
2025-02-28 08:55:16 +02:00
ParadoxV5
2047483417 Tag my_printf_error with ATTRIBUTE_FORMAT
[Breaking]
The `my_print_error` service passes formats and args directly
to `my_vsnprintf`. Just like the `my_snprintf` service,
I increased this service’s major version because:
* Custom suffixes are now a thing
  (and custom specifiers will soon no longer be).
* GCC `-Wformat` now checks formats sent to them.
2025-02-11 20:32:55 +01:00
Sergei Golubchik
9ee09a33bb Merge branch '11.7' into 11.8 2025-02-11 20:29:43 +01:00
Sergei Golubchik
ba01c2aaf0 Merge branch '11.4' into 11.7
* rpl.rpl_system_versioning_partitions updated for MDEV-32188
* innodb.row_size_error_log_warnings_3 changed error for MDEV-33658
  (checks are done in a different order)
2025-02-06 16:46:36 +01:00
Sergei Golubchik
7d657fda64 Merge branch '10.11 into 11.4 2025-01-30 12:01:11 +01:00
Sergei Golubchik
e69f8cae1a Merge branch '10.6' into 10.11 2025-01-30 11:55:13 +01:00
Marko Mäkelä
d4da659b43 MDEV-35854: Simplify dict_get_referenced_table()
innodb_convert_name(): Convert a schema or table name to
my_charset_filename compatible format.

dict_table_lookup(): Replaces dict_get_referenced_table().
Make the callers responsible for invoking innodb_convert_name().

innobase_casedn_str(): Remove. Let us invoke my_casedn_str() directly.

dict_table_rename_in_cache(): Do not duplicate a call to
dict_mem_foreign_table_name_lookup_set().

innobase_convert_to_filename_charset(): Defined static in the only
compilation unit that needs it.

dict_scan_id(): Remove the constant parameters
table_id=FALSE, accept_also_dot=TRUE. Invoke strconvert() directly.

innobase_convert_from_id(): Remove; only called from dict_scan_id().

innobase_convert_from_table_id(): Remove (dead code).

table_name_t::dblen(), table_name_t::basename(): In non-debug builds,
tolerate names that may miss a '/' separator.

Reviewed by: Debarun Banerjee
2025-01-23 14:38:08 +02:00
Marko Mäkelä
4dcb1b575b MDEV-35049: Use CRC-32C and avoid allocating heap
For the adaptive hash index, dtuple_fold() and rec_fold() were employing
a slow rolling hash algorithm, computing hash values ("fold") for one
field and one byte at a time, while depending on calls to
rec_get_offsets().

We already have optimized implementations of CRC-32C and have been
successfully using that function in some other InnoDB tables, but not
yet in the adaptive hash index.

Any linear function such as any CRC will fail the avalanche test that
any cryptographically secure hash function is expected to pass:
any single-bit change in the input key should affect on average half
the bits in the output.

But we always were happy with less than cryptographically secure:
in fact, ut_fold_ulint_pair() or ut_fold_binary() are just about as
linear as any CRC, using a combination of multiplication and addition,
partly carry-less. It is worth noting that exclusive-or corresponds to
carry-less subtraction or addition in a binary Galois field, or GF(2).

We only need some way of reducing key prefixes into hash values.
The CRC-32C should be better than a Rabin–Karp rolling hash algorithm.
Compared to the old hash algorithm, it has the drawback that there will
be only 32 bits of entropy before we choose the hash table cell by a
modulus operation. The size of each adaptive hash index array is
(innodb_buffer_pool_size / 512) / innodb_adaptive_hash_index_parts.
With the maximum number of partitions (512), we would not exceed 1<<32
elements per array until the buffer pool size exceeds 1<<50 bytes (1 PiB).
We would hit other limits before that: the virtual address space on many
contemporary 64-bit processor implementations is only 48 bits (256 TiB).
So, we can simply go for the SIMD accelerated CRC-32C.

rec_fold(): Take a combined parameter n_bytes_fields. Determine the
length of each field on the fly, and compute CRC-32C over a single
contiguous range of bytes, from the start of the record payload area
to the end of the last full or partial field. For secondary index records
in ROW_FORMAT=REDUNDANT, also the data area that is reserved for NULL
values (to facilitate in-place updates between NULL and NOT NULL values)
will be included in the count. Luckily, InnoDB always zero-initialized
such unused area; refer to data_write_sql_null() in
rec_convert_dtuple_to_rec_old(). For other than ROW_FORMAT=REDUNDANT,
no space is allocated for NULL values, and therefore the CRC-32C will
only cover the actual payload of the key prefix.

dtuple_fold(): For ROW_FORMAT=REDUNDANT, include the dummy NULL values
in the CRC-32C, so that the values will be comparable with rec_fold().

innodb_ahi-t: A unit test for rec_fold() and dtuple_fold().

btr_search_build_page_hash_index(), btr_search_drop_page_hash_index():
Use a fixed-size stack buffer for computing the fold values, to avoid
dynamic memory allocation.

btr_search_drop_page_hash_index(): Do not release part.latch if we
need to invoke multiple batches of rec_fold().

dtuple_t: Allocate fewer bits for the fields. The maximum number of
data fields is about 1023, so uint16_t will be fine for them. The
info_bits is stored in less than 1 byte.

ut_pair_min(), ut_pair_cmp(): Remove. We can actually combine and compare
int(n_fields << 16 | n_bytes).

PAGE_CUR_LE_OR_EXTENDS, PAGE_CUR_DBG: Remove. These were never defined,
because they would only work with latin1_swedish_ci if at all.

btr_cur_t::check_mismatch(): Replaces !btr_search_check_guess().

cmp_dtuple_rec_bytes(): Replaces cmp_dtuple_rec_with_match_bytes().
Determine the offsets of fields on the fly.

page_cur_try_search_shortcut_bytes(): This caller of
cmp_dtuple_rec_bytes() will not be invoked on the change buffer tree.

cmp_dtuple_rec_leaf(): Replaces cmp_dtuple_rec_with_match()
for comparing leaf-page records.

buf_block_t::ahi_left_bytes_fields: Consolidated Atomic_relaxed<uint32_t>
of curr_left_side << 31 | curr_n_bytes << 16 | curr_n_fields.
The other set of parameters (n_fields, n_bytes, left_side) was removed
as redundant.

btr_search_update_hash_node_on_insert(): Merged to
btr_search_update_hash_on_insert().

btr_search_build_page_hash_index(): Take combined left_bytes_fields
instead of n_fields, n_bytes, left_side.

btr_search_update_block_hash_info(), btr_search_update_hash_ref():
Merged to btr_search_info_update_hash().

btr_cur_t::n_bytes_fields: Replaces n_bytes << 16 | n_fields.

We also remove many redundant checks of btr_search.enabled.
If we are holding any btr_sea::partition::latch, then a nonnull pointer
in buf_block_t::index must imply that the adaptive hash index is enabled.

Reviewed by: Vladislav Lesin
2025-01-10 16:39:44 +02:00
Marko Mäkelä
9c8bdc6c15 MDEV-35049: btr_search_check_free_space_in_heap() is a bottleneck
Let us use implement a simple fixed-size allocator for the adaptive hash
index, insted of complicating mem_heap_t or mem_block_info_t.

MEM_HEAP_BTR_SEARCH: Remove.

mem_block_info_t::free_block(), mem_heap_free_block_free(): Remove.

mem_heap_free_top(), mem_heap_get_top(): Remove.

btr_sea::partition::spare: Replaces mem_block_info_t::free_block.
This keeps one spare block per adaptive hash index partition, to
process an insert.

We must not wait for buf_pool.mutex while holding
any btr_sea::partition::latch. That is why we cache one block for
future allocations. This is protected by a new
btr_sea::partition::blocks_mutex in order to relieve pressure on
btr_sea::partition::latch.

btr_sea::partition::prepare_insert(): Replaces
btr_search_check_free_space_in_heap().

btr_sea::partition::erase(): Replaces ha_search_and_delete_if_found().

btr_sea::partition::cleanup_after_erase(): Replaces the most part of
ha_delete_hash_node(). Unlike the previous implementation, we will
retain a spare block for prepare_insert().
This should reduce some contention on buf_pool.mutex.

btr_search.n_parts: Replaces btr_ahi_parts.

btr_search.enabled: Replaces btr_search_enabled. This must hold
whenever buf_block_t::index is set while a thread is holding a
btr_sea::partition::latch.

dict_index_t::search_info: Remove pointer indirection, and use
Atomic_relaxed or Atomic_counter for most fields.

btr_search_guess_on_hash(): Let the caller ensure that latch_mode is
BTR_MODIFY_LEAF or BTR_SEARCH_LEAF. Release btr_sea::partition::latch
before buffer-fixing the block. The page latch that we already acquired
is preventing buffer pool eviction. We must validate both
block->index and block->page.state while holding part.latch
in order to avoid race conditions with buffer page relocation
or buf_pool_t::resize().

btr_search_check_guess(): Remove the constant parameter
can_only_compare_to_cursor_rec=false.

ahi_node: Replaces ha_node_t.

This has been tested by running the regression test suite
with the adaptive hash index enabled:
./mtr --mysqld=--loose-innodb-adaptive-hash-index=ON

Reviewed by: Vladislav Lesin
2025-01-10 16:30:42 +02:00
Marko Mäkelä
33907f9ec6 Merge 11.4 into 11.7 2024-12-02 17:51:17 +02:00
Marko Mäkelä
2719cc4925 Merge 10.11 into 11.4 2024-12-02 11:35:34 +02:00
Marko Mäkelä
3d23adb766 Merge 10.6 into 10.11 2024-11-29 13:43:17 +02:00
Marko Mäkelä
3c312d247c MDEV-35190 HASH_SEARCH duplicates effort before HASH_INSERT or HASH_DELETE
The HASH_ macros are unnecessarily obfuscating the logic,
so we had better replace them.

hash_cell_t::search(): Implement most of the HASH_DELETE logic,
for a subsequent insert or remove().

hash_cell_t::remove(): Remove an element.

hash_cell_t::find(): Implement the HASH_SEARCH logic.

xb_filter_hash_free(): Avoid any hash table lookup;
just traverse the hash bucket chains and free each element.

xb_register_filter_entry(): Search databases_hash only once.

rm_if_not_found(): Make use of find_filter_in_hashtable().

dict_sys_t::acquire_temporary_table(), dict_sys_t::find_table():
Define non-inline to avoid unnecessary code duplication.

dict_sys_t::add(dict_table_t *table), dict_table_rename_in_cache():
Look for duplicate while finding the insert position.

dict_table_change_id_in_cache(): Merged to the only caller
row_discard_tablespace().

hash_insert(): Helper function of dict_sys_t::resize().

fil_space_t::create(): Look for a duplicate (and crash if found)
when searching for the insert position.

lock_rec_discard(): Take the hash array cell as a parameter
to avoid a duplicated lookup.

lock_rec_free_all_from_discard_page(): Remove a parameter.

Reviewed by: Debarun Banerjee
2024-11-21 08:59:02 +02:00
Marko Mäkelä
43465352b9 Merge 11.4 into 11.6 2024-10-03 16:09:56 +03:00
Marko Mäkelä
12a91b57e2 Merge 10.11 into 11.2 2024-10-03 13:24:43 +03:00
Marko Mäkelä
63913ce5af Merge 10.6 into 10.11 2024-10-03 10:55:08 +03:00
Marko Mäkelä
7e0afb1c73 Merge 10.5 into 10.6 2024-10-03 09:31:39 +03:00
Yuchen Pei
ba7088d462
Merge '11.4' into 11.6 2024-10-03 15:59:20 +10:00
Max Kellermann
6715e4dfe1 MDEV-34973: innobase/dict0dict: add noexcept to lock/unlock methods
Another chance for cutting back overhead due to C++ exceptions being
enabled; the `dict_sys_t` class is a good candidate because its
locking methods are called frequently.

Binary size reduction this time:

    text	  data	   bss	   dec	   hex	filename
 24448622	2436488	9473537	36358647	22ac9f7	build/release/sql/mariadbd
 24448474	2436488	9473601	36358563	22ac9a3	build/release/sql/mariadbd
2024-10-01 09:53:16 +03:00
Thirunarayanan Balathandayuthapani
cc810e64d4 MDEV-34392 Inplace algorithm violates the foreign key constraint
Don't allow the referencing key column from NULL TO NOT NULL
when

 1) Foreign key constraint type is ON UPDATE SET NULL
 2) Foreign key constraint type is ON DELETE SET NULL
 3) Foreign key constraint type is UPDATE CASCADE and referenced
 column declared as NULL

Don't allow the referenced key column from NOT NULL to NULL
when foreign key constraint type is UPDATE CASCADE
and referencing key columns doesn't allow NULL values

get_foreign_key_info(): InnoDB sends the information about
nullability of the foreign key fields and referenced key fields.

fk_check_column_changes(): Enforce the above rules for COPY
algorithm

innobase_check_foreign_drop_col(): Checks whether the dropped
column exists in existing foreign key relation

innobase_check_foreign_low() : Enforce the above rules for
INPLACE algorithm

dict_foreign_t::check_fk_constraint_valid(): This is used
by CREATE TABLE statement to check nullability for foreign
key relation.
2024-10-01 09:41:56 +05:30
Yuchen Pei
cfa9784edb
Merge branch '10.11' into 11.2 2024-09-18 10:25:16 +10:00
Yuchen Pei
b168859d1e
Merge branch '10.6' into 10.11 2024-09-11 16:10:53 +10:00
Marko Mäkelä
b7b2d2bde4 Merge 10.5 into 10.6 2024-09-09 11:30:30 +03:00
Marko Mäkelä
f06060f5ed Cleanup: Remove the function dict_remove_db_name() 2024-09-06 14:31:55 +03:00
Marko Mäkelä
024a18dbcb MDEV-34823 Invalid arguments in ib_push_warning()
In the bug report MDEV-32817 it occurred that the function
row_mysql_get_table_status() is outputting a fil_space_t*
as if it were a numeric tablespace identifier.

ib_push_warning(): Remove. Let us invoke push_warning_printf() directly.

innodb_decryption_failed(): Report a decryption failure and set the
dict_table_t::file_unreadable flag. This code was being duplicated in
very many places. We return the constant value DB_DECRYPTION_FAILED
in order to avoid code duplication in the callers and to allow tail calls.

innodb_fk_error(): Report a FOREIGN KEY error.

dict_foreign_def_get(), dict_foreign_def_get_fields(): Remove.
This code was being used in dict_create_add_foreign_to_dictionary()
in an apparently uncovered code path. That ib_push_warning() call
would pass the integer i+1 instead of a pointer to NUL terminated
string ("%s"), and therefore the call should have resulted in a crash.

dict_print_info_on_foreign_key_in_create_format(),
innobase_quote_identifier(): Add const qualifiers.

row_mysql_get_table_error(): Replaces row_mysql_get_table_status().
Display no message on DB_CORRUPTION; it should be properly reported at
the SQL layer anyway.
2024-09-06 14:29:09 +03:00
Oleksandr Byelkin
dd7d9d7fb1 Merge branch '11.4' into 11.5 2024-05-23 17:01:43 +02:00
Sergei Golubchik
f0a5412037 Merge branch '11.0' into 11.1 2024-05-13 09:52:30 +02:00
Sergei Golubchik
f9807aadef Merge branch '10.11' into 11.0 2024-05-12 12:18:28 +02:00
Sergei Golubchik
018d537ec1 Merge branch '10.6' into 10.11 2024-04-22 15:23:10 +02:00
Alexander Barkov
fd247cc21f MDEV-31340 Remove MY_COLLATION_HANDLER::strcasecmp()
This patch also fixes:
  MDEV-33050 Build-in schemas like oracle_schema are accent insensitive
  MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0
  MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0
  MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0
  MDEV-33088 Cannot create triggers in the database `MYSQL`
  MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0
  MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0
  MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0
  MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS
  MDEV-33120 System log table names are case insensitive with lower-cast-table-names=0

- Removing the virtual function strnncoll() from MY_COLLATION_HANDLER

- Adding a wrapper function CHARSET_INFO::streq(), to compare
  two strings for equality. For now it calls strnncoll() internally.
  In the future it will turn into a virtual function.

- Adding new accent sensitive case insensitive collations:
    - utf8mb4_general1400_as_ci
    - utf8mb3_general1400_as_ci
  They implement accent sensitive case insensitive comparison.
  The weight of a character is equal to the code point of its
  upper case variant. These collations use Unicode-14.0.0 casefolding data.

  The result of
     my_charset_utf8mb3_general1400_as_ci.strcoll()
  is very close to the former
     my_charset_utf8mb3_general_ci.strcasecmp()

  There is only a difference in a couple dozen rare characters, because:
    - the switch from "tolower" to "toupper" comparison, to make
      utf8mb3_general1400_as_ci closer to utf8mb3_general_ci
    - the switch from Unicode-3.0.0 to Unicode-14.0.0
  This difference should be tolarable. See the list of affected
  characters in the MDEV description.

  Note, utf8mb4_general1400_as_ci correctly handles non-BMP characters!
  Unlike utf8mb4_general_ci, it does not treat all BMP characters
  as equal.

- Adding classes representing names of the file based database objects:

    Lex_ident_db
    Lex_ident_table
    Lex_ident_trigger

  Their comparison collation depends on the underlying
  file system case sensitivity and on --lower-case-table-names
  and can be either my_charset_bin or my_charset_utf8mb3_general1400_as_ci.

- Adding classes representing names of other database objects,
  whose names have case insensitive comparison style,
  using my_charset_utf8mb3_general1400_as_ci:

  Lex_ident_column
  Lex_ident_sys_var
  Lex_ident_user_var
  Lex_ident_sp_var
  Lex_ident_ps
  Lex_ident_i_s_table
  Lex_ident_window
  Lex_ident_func
  Lex_ident_partition
  Lex_ident_with_element
  Lex_ident_rpl_filter
  Lex_ident_master_info
  Lex_ident_host
  Lex_ident_locale
  Lex_ident_plugin
  Lex_ident_engine
  Lex_ident_server
  Lex_ident_savepoint
  Lex_ident_charset
  engine_option_value::Name

- All the mentioned Lex_ident_xxx classes implement a method streq():

  if (ident1.streq(ident2))
     do_equal();

  This method works as a wrapper for CHARSET_INFO::streq().

- Changing a lot of "LEX_CSTRING name" to "Lex_ident_xxx name"
  in class members and in function/method parameters.

- Replacing all calls like
    system_charset_info->coll->strcasecmp(ident1, ident2)
  to
    ident1.streq(ident2)

- Taking advantage of the c++11 user defined literal operator
  for LEX_CSTRING (see m_strings.h) and Lex_ident_xxx (see lex_ident.h)
  data types. Use example:

  const Lex_ident_column primary_key_name= "PRIMARY"_Lex_ident_column;

  is now a shorter version of:

  const Lex_ident_column primary_key_name=
    Lex_ident_column({STRING_WITH_LEN("PRIMARY")});
2024-04-18 15:22:10 +04:00
Marko Mäkelä
829cb1a49c Merge 10.5 into 10.6 2024-04-17 14:14:58 +03:00
Oleksandr Byelkin
9b18275623 Merge branch '10.4' into 10.5 2024-04-16 11:04:14 +02:00
Vlad Lesin
d7fc975cfe MDEV-33802 Weird read view after ROLLBACK of other transactions.
In the case if some unique key fields are nullable, there can be
several records with the same key fields in unique index with at least
one key field equal to NULL, as NULL != NULL.

When transaction is resumed after waiting on the record with at least one
key field equal to NULL, and stored in persistent cursor record is
deleted, persistent cursor can be restored to the record with all key
fields equal to the stored ones, but with at least one field equal to
NULL. And such record is wrongly treated as a record with the same unique
key as stored in persistent cursor record one, what is wrong as
NULL != NULL.

The fix is to check if at least one unique field is NULL in restored
persistent cursor position, and, if so, then don't treat the record as
one with the same unique key as in the stored record key.

dict_index_t::nulls_equal was removed, as it was initially developed for
never existed in MariaDB "intrinsic tables", and there is no code, which
would set it to "true".

Reviewed by Marko Mäkelä.
2024-04-12 18:13:51 +03:00
Marko Mäkelä
683fbced6b Merge 11.0 into 11.1 2024-03-28 12:15:36 +02:00
Alexander Barkov
929c2e06aa MDEV-31531 Remove my_casedn_str() and my_caseup_str()
Under terms of MDEV 27490 we'll add support for non-BMP identifiers
and upgrade casefolding information to Unicode version 14.0.0.
In Unicode-14.0.0 conversion to lower and upper cases can increase octet length
of the string, so conversion won't be possible in-place any more.

This patch removes virtual functions performing in-place casefolding:
  - my_charset_handler_st::casedn_str()
  - my_charset_handler_st::caseup_str()
and fixes the code to use the non-inplace functions instead:
  - my_charset_handler_st::casedn()
  - my_charset_handler_st::caseup()
2024-02-28 22:20:29 +04:00
Marko Mäkelä
d73baa402a Merge 10.11 into 11.0 2024-02-20 12:02:01 +02:00
Marko Mäkelä
86c2c89743 Merge 10.6 into 10.11 2024-02-08 15:04:46 +02:00
Marko Mäkelä
77b4399545 MDEV-33421 innodb.corrupted_during_recovery fails due to error that the table is corrupted
This fixes up the merge commit 7e39470e33

dict_table_open_on_name(): Report ER_TABLE_CORRUPT in a consistent
fashion, with a pretty-printed table name.
2024-02-08 14:20:42 +02:00
Marko Mäkelä
b2654ba826 MDEV-32899 InnoDB is holding shared dict_sys.latch while waiting for FOREIGN KEY child table lock on DDL
lock_table_children(): A new function to lock all child tables of a table.
We will only hold dict_sys.latch while traversing
dict_table_t::referenced_set. To prevent a race condition with
std::set::erase() we will copy the pointers to the child tables to a
local vector. Once we have acquired MDL and references to all child tables,
we can safely release dict_sys.latch, wait for the locks, and finally
release the references.

dict_acquire_mdl_shared(): A new variant that takes mdl_context as a
parameter.

lock_table_for_trx(): Assert that we are not holding dict_sys.latch.

ha_innobase::truncate(): When foreign_key_checks=ON, assert that
no child tables exist (other than the current table).
In any case, we will invoke lock_table_children()
so that the child table metadata can be safely updated.
(It is possible that a child table is being created concurrently
with TRUNCATE TABLE.)

ha_innobase::delete_table(): Before and after acquiring exclusive
locks on the current table as well as all child tables,
check that FOREIGN KEY constraints will not be violated.
In this way, we can reject impossible DROP TABLE without having to
wait for locks first.

This fixes up commit 2ca1123464 (MDEV-26217)
and commit c3c53926c4 (MDEV-26554).
2024-02-08 14:22:35 +11:00
Marko Mäkelä
5f2dcd112b MDEV-24167 fixup: srw_lock_debug instrumentation
While the index_lock and block_lock include debug instrumentation
to keep track of shared lock holders, such instrumentation was
never part of the simpler srw_lock, and therefore some users of the
class implemented a limited form of bookkeeping.

srw_lock_debug encapsulates srw_lock and adds the data members
writer, readers_lock, and readers to keep track of the threads that
hold the exclusive latch or any shared latches. The debug checks
are available also with SUX_LOCK_GENERIC (in environments that do not
implement a futex-like system call).

dict_sys_t::latch: Use srw_lock_debug in debug builds.
This makes the debug fields latch_ex, latch_readers redundant.

fil_space_t::latch: Use srw_lock_debug in debug builds.
This makes the debug field latch_count redundant.
The field latch_owner must be preserved, because
fil_space_t::is_owner() is being used in all builds.

lock_sys_t::latch: Use srw_lock_debug in debug builds.
This makes the debug fields writer, readers redundant.

lock_sys_t::is_holder(): A new debug predicate to check if
the current thread is holding lock_sys.latch in any mode.

trx_rseg_t::latch: Use srw_lock_debug in debug builds.
2024-02-08 14:22:35 +11:00
Sergei Golubchik
b6680e0101 Merge branch '11.0' into 11.1 2024-02-02 11:30:47 +01:00
Marko Mäkelä
35cc4b6c05 Merge 10.11 into 11.0 2024-01-22 10:10:50 +02:00
Marko Mäkelä
b3ca7fa089 Merge 10.6 into 10.11 2024-01-22 08:49:04 +02:00
Marko Mäkelä
21560bee9d Revert "MDEV-32899 InnoDB is holding shared dict_sys.latch while waiting for FOREIGN KEY child table lock on DDL"
This reverts commit 569da6a7ba,
commit 768a736174, and
commit ba6bf7ad9e
because of a regression that was filed as MDEV-33104.
2024-01-19 12:46:11 +02:00
Sergei Golubchik
7a5448f8da Merge branch '11.0' into 11.1 2023-12-19 20:11:54 +01:00
Sergei Golubchik
8c8bce05d2 Merge branch '10.11' into 11.0 2023-12-19 15:53:18 +01:00
Sergei Golubchik
fd0b47f9d6 Merge branch '10.6' into 10.11 2023-12-18 11:19:04 +01:00