Commit graph

79728 commits

Author SHA1 Message Date
Sergei Golubchik
b56ca29f89 MDEV-35105 Assertion `tab->join->order' fails upon vector search with DISTINCT
don't apply distinct optimization to order by a vector index
2024-11-05 14:00:51 -08:00
Sergei Golubchik
9ddb94f60e MDEV-35104 Invalid (old?) table or database name upon DDL on table with vector key and unique key
InnoDB rename needs the same workaround for hlindexes
as it has for partitions
2024-11-05 14:00:51 -08:00
Sergei Golubchik
7d9c0e4f62 MDEV-35092 Server crash, hang or ASAN errors in mysql_create_frm_image upon using non-default table options and system variables
extend the option_list expicitly on CREATE/ALTER, not implicitly
on parsing.
2024-11-05 14:00:51 -08:00
Sergei Golubchik
cdc7253787 make MyISAM and Aria report correct reflength to the server
MyISAM and Aria used to lie to the server about the reflength value.
One value was used internally, it was stored on disk, e.g. in indexes,
and couldn't be changed without full table rebuild. A differently
calculated value was reported to the server - that value was sometimes
larger than the true reflength.

That caused the server to allocate more memory per position than
necessary - affecting filesort, join buffer usage, optimizer cost
calculations, and may be more.
2024-11-05 14:00:51 -08:00
Sergei Golubchik
ea1e720391 MDEV-35078 Server crash or ASAN errors in mhnsw_insert
when adding a column or index that uses plugin-defined
sysvar-based options with ALTER TABLE ... ADD, the server
was using the default value of the sysvar, not the current one.

CREATE TABLE was correctly using the current sysvar value.

Fix it so that new columns/indexes added in ALTER TABLE ... ADD
would use a current sysvar value. Existing columns/indexes
in ALTER TABLE would keep using the default sysvar value
(unless they had an explicit value in frm).
2024-11-05 14:00:51 -08:00
Sergei Golubchik
855aefb7b5 mysqldump and mariadb-backup tests of vector indexes 2024-11-05 14:00:51 -08:00
Sergei Golubchik
eb4ab2ce8f MDEV-35061 XA PREPARE "not supported by the engine" from storage engine mhnsw, memory leak
disallow explicit XA PREPARE over mhnsw indexes
2024-11-05 14:00:51 -08:00
Sergei Golubchik
09cd817f5d MDEV-35060 Assertion failure upon DML on table with vector under lock 2024-11-05 14:00:51 -08:00
Sergei Golubchik
09889d417b MDEV-35055 ASAN errors in TABLE_SHARE::lock_share upon committing transaction after FLUSH on table with vector key
MHNSW_Trx cannot store a pointer to the TABLE_SHARE for the duration
of a transaction, because the share can be evicted from the cache any
time.

Use a pointer to the MDL_ticket instead, it is stored in the THD
and has a duration of MDL_TRANSACTION, so won't go away.

When we need a TABLE_SHARE - on commit - get a new one from tdc.
Normally, it'll be already in the cache, so it'd be fast.
We don't optimize for the edge case when TABLE_SHARE was evicted.
2024-11-05 14:00:51 -08:00
Sergei Golubchik
d396fb9226 MDEV-35021 Behavior for RTREE indexes changed, assertion fails
disallow USING RTREE for not SPATIAL index
2024-11-05 14:00:51 -08:00
Sergei Golubchik
b3afd9f640 MDEV-35042 Vector indexes are allowed for MERGE tables, but do not
disallow hlindexes in MERGE - because we cannot create the secondary
table *in the same engine*
2024-11-05 14:00:51 -08:00
Sergei Golubchik
0932c3a27e MDEV-35044 ALTER on a table with vector index attempts to bypass unsupported locking limitation, server crashes in THD::free_tmp_table_share
open secondary tables early enough for the cleanup on error to see
them and remove their underlying files
2024-11-05 14:00:51 -08:00
Sergei Golubchik
824a63852b MDEV-35043 Unsuitable error upon an attempt to create MEMORY table with vector key
MEMORY engine doesn't support blobs
2024-11-05 14:00:51 -08:00
Sergei Golubchik
9f80e3fbb7 MDEV-35032 streaming mode for mhnsw search
support SQL semantics for SELECT ... WHERE ... ORDER BY ... LIMIT

* switch from returning k nearest neighbors to returning
  as many as needed, in k-neighbor chunks, with increasing distance
* make search_layer() skips nodes that are closer than a threshold
* read_next keeps a search context - list of k found nodes,
  threshold, ctx, etc.
* when the list of found nodes is exhausted, it repeats the search
  starting from last found nodes and a threshold
* search context kepts ctx->refcount incremented, so ctx won't go away
* but commit_lock is unlocked between calls, so InnoDB can modify the table
* use ctx version to detect that, switch to MHNSW_Trx when it happens

bugfix:
* use the correct lock in ha_external_lock() for the graph table
* InnoDB didn't reset locks on ha_external_lock(F_UNLCK) and previous
  LOCK_X leaked into the next statement
2024-11-05 14:00:51 -08:00
Sergei Golubchik
be69716287 MDEV-35029 ASAN errors in Lex_ident<Compare_ident_ci>::is_valid_ident upon DDL on table with vector index
in ALTER TABLE or CREATE TABLE LIKE, a create a copy of key->option_list,
because it can be extended later on the thd->mem_root, so it has
to be a copy
2024-11-05 14:00:51 -08:00
Sergei Golubchik
a6499049af MDEV-35033 LeakSanitizer errors in my_malloc / safe_mutex_lazy_init_deadlock_detection / MHNSW_Context::alloc_node and alike
ctx wasn't released on errors
2024-11-05 14:00:51 -08:00
Sergei Golubchik
6837bb54f4 MDEV-35020 After a failed attempt to create vector index temporary file remains and prevents further operation 2024-11-05 14:00:50 -08:00
Sergei Golubchik
ec2ff9f2a0 MDEV-35035 Assertion failure in ha_blackhole::position upon INSERT into blackhole table with vector index
let's allow ::position() and ::rnd_pos() in blackhole.
::position() can be called directly after insert, it doesn't need
a search to happen, so it's possible.

::rnd_pos() can be called with a value that ::position() produced,
so, possible too.
2024-11-05 14:00:50 -08:00
Sergei Golubchik
8ac3f0b1d4 MDEV-35038 Server crash in Index_statistics::get_avg_frequency upon EITS collection for vector index
don't collect eits for vector indexes
2024-11-05 14:00:50 -08:00
Sergei Golubchik
0bd01f4a95 MDEV-35039 Number of indexes inside InnoDB differs from that defined in MariaDB after altering table with vector key
don't show table->s->total_keys to engine in inplace alter
2024-11-05 14:00:50 -08:00
Sergei Golubchik
8253650aaa MDEV-35006 Using varbinary as vector-storing column results in assertion failures
* hlindexes cannot be extended with pk
* hlindexes cannot be covering
2024-11-05 14:00:50 -08:00
Sergei Golubchik
fb04cad37e trying to stabilize floating-point tests 2024-11-05 14:00:50 -08:00
Sergey Vojtovich
f867c2a21e Disabled high-level indexes with Aria
... until a few bugs that cause server crash are fixed.
2024-11-05 14:00:50 -08:00
Sergey Vojtovich
ca17b68bb6 ALTER TABLE fixes for high-level indexes (iii)
quick_rm_table() expects .frm to exist when it removes high-level indexes.
For cases like ALTER TABLE t1 RENAME TO t2, ENGINE=other_engine .frm was
removed earlier.

Another option would be removing high-level indexes explicitly before the
first quick_rm_table() and skipping high-level indexes for subsequent
quick_rm_table(NO_FRM_RENAME).

But this suggested order may also help with ddl log recovery. That is
if we crash before high-level indexes are removed, .frm is going to
exist.
2024-11-05 14:00:50 -08:00
Sergey Vojtovich
7aa6bb3aa3 ALTER TABLE fixes for high-level indexes (ii)
Disable non-copy ALTER algorithms when VECTOR index is affected. Engines
are not supposed to handle high-level indexes anyway.

Also fixed misbehaving IF [NOT] EXISTS variants.
2024-11-05 14:00:50 -08:00
Sergey Vojtovich
a90fa3f397 ALTER TABLE fixes for high-level indexes (i)
Fixes for ALTER TABLE ... ADD/DROP COLUMN, ALGORITHM=COPY.

Let quick_rm_table() remove high-level indexes along with original table.

Avoid locking uninitialized LOCK_share for INTERNAL_TMP_TABLEs.

Don't enable bulk insert when altering a table containing vector index.
InnoDB can't handle situation when bulk insert is enabled for one table
but disabled for another. We can't do bulk insert on vector index as it
does table updates currently.
2024-11-05 14:00:50 -08:00
Sergei Golubchik
2ad9df8c9b VEC_Distance_Cosine() 2024-11-05 14:00:50 -08:00
Sergei Golubchik
2e1fcc6a80 rename VEC_Distance to VEC_Distance_Euclidean
and create a parent Item_func_vec_distance_common class
2024-11-05 14:00:50 -08:00
Sergei Golubchik
0da820cb12 mhnsw: use plugin index options and transaction_participant API 2024-11-05 14:00:50 -08:00
Sergei Golubchik
126d6d787c cleanup: handlerton
remove unused methods, reorder methods, add comments
2024-11-05 14:00:50 -08:00
Sergei Golubchik
8087cefc07 make rename test to work with InnoDB too 2024-11-05 14:00:50 -08:00
Sergei Golubchik
445198c10e pos-fixes for rename 2024-11-05 14:00:50 -08:00
Sergey Vojtovich
97e112fb82 VECTOR indexes support for RENAME TABLE
Rename high-level indexes along with a table.
2024-11-05 14:00:49 -08:00
Sergei Golubchik
ebcbed6d74 post-fixes for TRUNCATE
* fix the truncate-by-handler variant, used by InnoDB
* test that insert works after truncate, meaning graph table was emptied
* test that the vector index size is zero after truncate in MyISAM
2024-11-05 14:00:49 -08:00
Sergey Vojtovich
70575defb7 Fixed TRUNCATE TABLE against VECTOR indexes
This patch fixes only TRUNCATE by recreate variant, there seem to be no
reasonable engine that uses TRUNCATE by handler method for testing.

Reset index_cinfo so that mi_create is not confused by garbage passed via
index_file_name and sets MY_DELETE_OLD flag.

Review question: can we add a test case to make sure VECTOR index is empty
indeed?
2024-11-05 14:00:49 -08:00
Sergey Vojtovich
91a24ddc5d Test insert ... select with vector index 2024-11-05 14:00:49 -08:00
Sergey Vojtovich
4aa1968b89 Disable VECTOR indexes with partitioned tables 2024-11-05 14:00:49 -08:00
Sergey Vojtovich
7c16bba71d CREATE TABLE ... LIKE loses VECTOR index 2024-11-05 14:00:49 -08:00
Vicențiu Ciorbaru
eec1339f5d MDEV-32886 Vec_FromText and Vec_ToText
This commit introduces two utility functions meant to make working with
vectors simpler.

Vec_ToText converts a binary vector into a json array of numbers
(floats).
Vec_FromText takes in a json array of numbers and converts it into a
little-endian IEEE float sequence of bytes (4 bytes per float).
2024-11-05 14:00:49 -08:00
Sergei Golubchik
f44989ff0f UPDATE/DELETE post-fixes 2024-11-05 14:00:49 -08:00
Hugo Wen
0e2b9e7621 MDEV-33408 Initial support for vector DELETE and UPDATE
When the source row is deleted, mark the corresponding node in HNSW
index by setting `tref` to null. An index is added for the `tref` in
secondary table for faster searching of the to-be-marked nodes.

The nodes marked as deleted will still be used for search, but will not
be included in the final query results.

As skipping deleted nodes and not adding deleted nodes for new-inserted
nodes' neighbor list could impact the performance, we now only skip
these nodes in search results.

- for some reason the bitmap is not set for hlindex during the delete so
  I had to temporarily comment out one line

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
2024-11-05 14:00:49 -08:00
Sergei Golubchik
049d839350 mhnsw: inter-statement shared cache
* preserve the graph in memory between statements
* keep it in a TABLE_SHARE, available for concurrent searches
* nodes are generally read-only, walking the graph doesn't change them
* distance to target is cached, calculated only once
* SIMD-optimized bloom filter detects visited nodes
* nodes are stored in an array, not List, to better utilize bloom filter
* auto-adjusting heuristic to estimate the number of visited nodes
  (to configure the bloom filter)
* many threads can concurrently walk the graph. MEM_ROOT and Hash_set
  are protected with a mutex, but walking doesn't need them
* up to 8 threads can concurrently load nodes into the cache,
  nodes are partitioned into 8 mutexes (8 is chosen arbitrarily, might
  need tuning)
* concurrent editing is not supported though
* this is fine for MyISAM, TL_WRITE protects the TABLE_SHARE and the
  graph (note that TL_WRITE_CONCURRENT_INSERT is not allowed, because an
  INSERT into the main table means multiple UPDATEs in the graph)
* InnoDB uses secondary transaction-level caches linked in a list in
  in thd->ha_data via a fake handlerton
* on rollback the secondary cache is discarded, on commit nodes
  from the secondary cache are invalidated in the shared cache
  while it is exclusively locked
* on savepoint rollback both caches are flushed. this can be improved
  in the future with a row visibility callback
* graph size is controlled by @@mhnsw_cache_size, the cache is flushed
  when it reaches the threshold
2024-11-05 14:00:49 -08:00
Sergei Golubchik
5c2b7c6e7f mhnsw: configurable parameters
1. introduce alpha. the value of 1.1 is optimal, so hard-code it.

2. hard-code ef_construction=10, best by test

3. rename hnsw_max_connection_per_layer to mhnsw_max_edges_per_node
   (max_connection is rather ambiguous in MariaDB) and add a help text

4. rename hnsw_ef_search to mhnsw_min_limit and add a help text
2024-11-05 14:00:49 -08:00
Sergei Golubchik
25b4000290 InnoDB support for hlindexes and mhnsw
* mhnsw:
  * use primary key, innodb loves and (and the index cannot have dupes anyway)
    * MyISAM is ok with that, performance-wise
  * must be ha_rnd_init(0) because we aren't going to scan
    * MyISAM resets the position on ha_rnd_init(0) so query it before
    * oh, and use the correct handler, just in case
  * HA_ERR_RECORD_IS_THE_SAME is no error
* innodb:
  * return ref_length on create
  * don't assume table->pos_in_table_list is set
  * ok, assume away, but only for system versioned tables
* set alter_info on create (InnoDB needs to check for FKs)
* pair external_lock/external_unlock correctly
2024-11-05 14:00:49 -08:00
Sergei Golubchik
3ff7f04fd4 misc changes
* sysvars should be REQUIRED_ARG
* fix a mix of US and UK spelling (use US)
* use consistent naming
* work if VEC_DISTANCE arguments are in the swapped order (const, col)
* work if VEC_DISTANCE argument is NULL/invalid or wrong length
* abort INSERT if the value is invalid or wrong length
* store the "number of neighbors" in a blob in endianness-independent way
* use field->store(longlong, bool) not field->store(double)
* a lot more error checking everywhere
* cleanup after errors
* simplify calling conventions, remove reinterpret_cast's
* todo/XXX comments
* whitespaces
* use float consistently

memory management is still totally PoC quality
2024-11-05 14:00:48 -08:00
Vicențiu Ciorbaru
88839e71a3 Initial HNSW implementation
This commit includes the work done in collaboration with Hugo Wen from
Amazon:

    MDEV-33408 Alter HNSW graph storage and fix memory leak

    This commit changes the way HNSW graph information is stored in the
    second table. Instead of storing connections as separate records, it now
    stores neighbors for each node, leading to significant performance
    improvements and storage savings.

    Comparing with the previous approach, the insert speed is 5 times faster,
    search speed improves by 23%, and storage usage is reduced by 73%, based
    on ann-benchmark tests with random-xs-20-euclidean and
    random-s-100-euclidean datasets.

    Additionally, in previous code, vector objects were not released after
    use, resulting in excessive memory consumption (over 20GB for building
    the index with 90,000 records), preventing tests with large datasets.
    Now ensure that vectors are released appropriately during the insert and
    search functions. Note there are still some vectors that need to be
    cleaned up after search query completion. Needs to be addressed in a
    future commit.

    All new code of the whole pull request, including one or several files
    that are either new files or modified ones, are contributed under the
    BSD-new license. I am contributing on behalf of my employer Amazon Web
    Services, Inc.

As well as the commit:

    Introduce session variables to manage HNSW index parameters

    Three variables:

    hnsw_max_connection_per_layer
    hnsw_ef_constructor
    hnsw_ef_search

    ann-benchmark tool is also updated to support these variables in commit
    https://github.com/HugoWenTD/ann-benchmarks/commit/e09784e for branch
    https://github.com/HugoWenTD/ann-benchmarks/tree/mariadb-configurable

    All new code of the whole pull request, including one or several files
    that are either new files or modified ones, are contributed under the
    BSD-new license. I am contributing on behalf of my employer Amazon Web
    Services, Inc.

Co-authored-by: Hugo Wen <wenhug@amazon.com>
2024-11-05 14:00:48 -08:00
Sergei Golubchik
d6add9a03d initial support for vector indexes
MDEV-33407 Parser support for vector indexes

The syntax is

  create table t1 (... vector index (v) ...);

limitation:
* v is a binary string and NOT NULL
* only one vector index per table
* temporary tables are not supported

MDEV-33404 Engine-independent indexes: subtable method

added support for so-called "high level indexes", they are not visible
to the storage engine, implemented on the sql level. For every such
an index in a table, say, t1, the server implicitly creates a second
table named, like, t1#i#05 (where "05" is the index number in t1).
This table has a fixed structure, no frm, not accessible directly,
doesn't go into the table cache, needs no MDLs.

MDEV-33406 basic optimizer support for k-NN searches

for a query like SELECT ... ORDER BY func() optimizer will use
item_func->part_of_sortkey() to decide what keys can be used
to resolve ORDER BY.
2024-11-05 14:00:48 -08:00
Sergei Golubchik
aa09cb3b11 open frm for DROP TABLE
needed to get partitioning and information about
secondary objects
2024-11-05 14:00:48 -08:00
Sergei Golubchik
1fe8a1bb76 cleanup: generalize ER_INNODB_NO_FT_TEMP_TABLE 2024-11-05 14:00:48 -08:00
Sergei Golubchik
fd69abe44f cleanup: generalize ER_SPATIAL_CANT_HAVE_NULL 2024-11-05 14:00:48 -08:00
Sergei Golubchik
062f8eb37d cleanup: key algorithm vs key flags
the information about index algorithm was stored in two
places inconsistently split between both.

BTREE index could have key->algorithm == HA_KEY_ALG_BTREE, if the user
explicitly specified USING BTREE or HA_KEY_ALG_UNDEF, if not.

RTREE index had key->algorithm == HA_KEY_ALG_RTREE
and always had key->flags & HA_SPATIAL

FULLTEXT index had  key->algorithm == HA_KEY_ALG_FULLTEXT
and always had key->flags & HA_FULLTEXT

HASH index had key->algorithm == HA_KEY_ALG_HASH or HA_KEY_ALG_UNDEF

long unique index always had key->algorithm == HA_KEY_ALG_LONG_HASH

In this commit:

All indexes except BTREE and HASH always have key->algorithm
set, HA_SPATIAL and HA_FULLTEXT flags are not used anymore (except
for storage to keep frms backward compatible).

As a side effect ALTER TABLE now detects FULLTEXT index renames correctly
2024-11-05 14:00:47 -08:00
Sergei Golubchik
44ff2f7831 reject invalid spatial key declarations in the parser 2024-11-05 14:00:47 -08:00
Sergei Golubchik
9fa31c1bd9 cleanup: spaces, casts, comments 2024-11-05 14:00:47 -08:00
Sergei Golubchik
4f4c5a2ba9 fix a typo and an old bug in prefschema.transaction test 2024-11-05 14:00:47 -08:00
Sergei Golubchik
70f000f1dc fix main.plugin_vars test to cleanup after itself 2024-11-05 14:00:46 -08:00
Sergei Golubchik
9ddac64188 make INFORMATION_SCHEMA.STATISTICS.COMMENT not nullable
as it can never be null (only "" or "disabled")
2024-11-05 14:00:46 -08:00
Sergei Golubchik
680bdb76a6 fix for 32bit 2024-11-05 14:00:46 -08:00
Vladislav Vaintroub
faf9e755ba MDEV-35109 fix test case
rpl_semi_sync_after_sync_coord_consistency fails on release compilation
2024-11-05 22:38:55 +01:00
Vladislav Vaintroub
37b7986467 Merge branch '10.5' into 10.6 2024-11-05 21:02:22 +01:00
Alexander Barkov
7741065936 MDEV-23895 Server crash, ASAN heap-buffer-overflow or Valgrind Invalid write in Item_func_rpad::val_str
Item_cache_int::val_str() and Item_cache_real::val_str() erroneously
used default_charset(). Fixing to return my_charset_numeric instead.
2024-11-05 12:36:08 +04:00
Oleg Smirnov
a914087fab MDEV-35307 Unexpected error WARN_SORTING_ON_TRUNCATED_LENGTH or assertion failure in diagnostics area #2
When strict mode is enabled, all warnings during `INSERT` are
converted to errors regardless of their actual severity.
`WARN_SORTING_ON_TRUNCATED_LENGTH` is not considered severe enough
to be elevated to the ERROR level, and this commit fixes that
2024-11-05 14:52:20 +07:00
Alexander Barkov
eb41c1171e MDEV-33942 View cuts off the end of string with the utf8 character set in INSERT function
Item_func_insert::fix_length_and_dec() incorrectly calculated max_length
when its collation.collation evaluated to my_charset_bin.

Fixing the code to calculate max_length in terms of octets rather
than in terms of characters when collation.collation is my_charset_bin.
2024-11-05 11:16:10 +04:00
Alexander Barkov
c2bf1d4781 MDEV-29552 LEFT and RIGHT with big value for parameter 'len' >0 return empty value in view
The code in max_length_for_string() erroneously returned 0
for huge numbers like 4294967295.

Rewriting the code in a more straightforward way.
2024-11-05 09:19:05 +04:00
Brandon Nesterenko
b07258a0d5 MDEV-35109: Semi-sync Replication stalling Primary using wait point=AFTER_SYNC
For a primary configured with wait_point=AFTER_SYNC, if two threads
T1 (binlogging through MYSQL_BIN_LOG::write()) and T2 were
binlogging at the same time, T1 could accidentally wait for its
semi-sync ACK using the binlog coordinates of T2. Prior to
MDEV-33551, this only resulted in delayed transactions, because all
transactions shared the same condition variable for ACK signaling.
However, with the MDEV-33551 changes, each thread has its own
condition variable to signal. So T1 could wait indefinitely when
either:
  1) T1's ACK is received but not T2's when T1 goes into
wait_after_sync(), because the ACK receiver thread has already
notified about the T1 ACK, but T1 was _actually_ waiting on T2's
ACK, and therefore tries to wait (in vain).

  2) T1 goes to wait_after_sync() before any ACKs have arrived. When
T1's ACK comes in, T1 is woken up; however, sees it needs to wait
more (because it was actually waiting on T2's ACK), and goes to wait
again (this time, in vain).

Note that the actual cause of T1 waiting on T2's binlog coordinates
is when MYSQL_BIN_LOG::write() would call
Repl_semisync_master::wait_after_sync(), the binlog offset parameter
was read as the end of MYSQL_BIN_LOG::log_file, which is shared
among transactions. So if T2 had updated the binary log _after_ T1
had released LOCK_log, but not yet invoked wait_after_sync(), it
would use the end of the binary log file as the binlog offset, which
was that of T2 (or any future transaction).

The fix in this patch ensures consistency between the binary log
coordinates a transaction uses between report_binlog_update() and
wait_after_sync().

Reviewed By
============
Kristian Nielsen <knielsen@knielsen-hq.org>
Andrei Elkin <andrei.elkin@mariadb.com>
2024-11-04 10:45:58 -07:00
Brandon Nesterenko
5290fa043b MDEV-35109 PREP: simulate_delay_semisync_slave_reply use debug_sync
This is a preparatory commit for MDEV-35109 to make its
testing code cleaner (and harden other tests too).

The DEBUG_DBUG point simulate_delay_semisync_slave_reply
up to this patch used my_sleep() to delay an ACK response,
but sleeps are prone to test failures on machines that
run tests when already having a heavy load (e.g. on
buildbot).

This patch changes this DEBUG_DBUG sleep to use DEBUG_SYNC
to coordinate exactly when a slave should send its reply,
which is safer and faster.

As DEBUG_SYNC can't be used while a server is shutting
down, to synchronize threads with SHUTDOWN WAIT FOR SLAVES
logic, we use and extend wait_for_pattern_in_file.inc to
wait for an informational error message in the logic to
indicate that the shutdown process has reached the
intended state (i.e. indicating that the shutdown has
been delayed to await semi-sync ACKs). Specifically, the
extensions are as follows:

 1. wait_for_pattern_in_file.inc is extended with parameter
    wait_for_pattern_count as a number that indicates the
    number of times a pattern should occur in the file before
    return control back to the calling script.

 2. search_for_pattern_in_file.inc is extended with parameter
    SEARCH_ABORT_IS_SUCCESS to inverse the error/success
    logic, so the SEARCH_ABORT condition can be used to
    indicate success, rather than error.
2024-11-04 10:45:58 -07:00
Oleksandr Byelkin
a37f71bd10 Merge branch '10.11' into mariadb-10.11.10 2024-11-04 07:42:26 +01:00
Oleksandr Byelkin
f2bb2ab58c Merge branch '10.6' into mariadb-10.6.20 2024-11-04 07:40:45 +01:00
Oleksandr Byelkin
ecdccddaae Merge branch '10.5' into mariadb-10.5.27 2024-11-04 07:35:28 +01:00
Alexander Barkov
e60fd6c204 MDEV-28767 Collation "binary" is not accepted for databases, tables, columns
MariaDB in a COLLATE clause supported 'binary' only as an identifier:
  COLLATE `binary`

Fixing the parser to understand 'binary' as a keyword:
  COLLATE binary

This is for MySQL compatibility.
2024-11-02 12:47:28 +04:00
Alexander Barkov
5d4a4d2091 Fixing main.type_timestamp failure with --view
The patch for
  MDEV-35250 Assertion `dec <= 6' failed in my_timestamp_binary_length
added a test which depends on
  MDEV-29534 In view FROM_UNIXTIME adds .000000 in the result

Adding --disable_view_protocol around the affected statements.
2024-11-02 12:46:27 +04:00
Sergei Golubchik
ac7fe8b214 fix main.selectivity_notembedded --view 2024-11-01 21:01:31 +01:00
Sergei Golubchik
0a3452cf83 MDEV-35229 fix the test for --view
also, enable it for --ps
2024-11-01 20:52:58 +01:00
Alexander Barkov
d661bc1552 MDEV-20944 Wrong result of LEAST() and ASAN heap-use-after-free in my_strnncollsp_simple / Item::temporal_precision on TIME()
The code tried to avoid String::copy() but did it in a wrong way,
so asan detected heap-use-after-free errors.
Removing the wrong optimization, using copy() instead.
2024-11-01 15:55:09 +04:00
Alexander Barkov
dd41be2a51 MDEV-29184 Assertion `0' in Item_row::illegal_method_call, Type_handler_row::Item_update_null_value, Item::update_null_value
- Moving the check_cols(1) test from fix_fields() to fix_length_and_dec().
  So the test is now done before the code calling val_decimal()
  in fix_length_and_dec().

- Removing Item_func_interval::fix_fields(), as it become equal
  to the inherited one.
2024-11-01 12:40:43 +04:00
Sergei Golubchik
947de4b1db print more digits for floating point options in in mariadbd --help 2024-11-01 08:58:43 +01:00
Monty
40810baffe MDEV-33144 Implement the Percona variable slow_query_log_always_write_time
This task is inspired by the Percona implementation of
slow_query_log_always_write_time.

This task implements the variable log_slow_always_query_time (name
matching other MariaDB variables using the slow query log). The
default value for the variable is 31536000, which makes MariaDB
compatible with older installations.

For queries with execution time longer than log_slow_always_query_time
the variables log_slow_rate_limit and log_slow_min_examined_row_limit
will be ignored and the query will be written to the slow query log
if there is no other limitations (like log_slow_filter etc).

Other things:
- long_query_time internal variable renamed to log_slow_query_time.
- More descriptive information for "log_slow_query_time".
2024-11-01 08:58:37 +01:00
Brandon Nesterenko
e9a502df08 Testing fix for rpl_semi_sync_cond_var_per_thd failure 2024-10-30 08:32:19 -06:00
Oleksandr Byelkin
c770bce898 Merge branch '11.2' into 11.4 2024-10-30 15:11:17 +01:00
Oleg Smirnov
bf9662f6fa MDEV-35275 Unexpected WARN_SORTING_ON_TRUNCATED_LENGTH or assertion failure in diagnostics area
MDEV-27277 added warnings on truncation during sorting for SELECTs
but did not for DML operations. However, UPDATEs and DELETEs may also
perform sorting and thus produce warnings. This commit fixes that
2024-10-30 18:47:11 +07:00
Alexander Barkov
556a40dce0 MDEV-35229 NOCOPY has become reserved word bringing wide incompatibility
This patch was suggested by Sergei Golubchik.

It reverts the second patch from the PR:

  commit fa5eeb4931
    Fixed ALTER TABLE NOCOPY keyword failure

and adds NOCOPY_SYM into keyword_func_sp_var_and_label.

The price is one extra shift/recuce conflict in yy_oracle.yy.
This should to tolerable.
2024-10-30 13:58:20 +04:00
Oleg Smirnov
c3a7a3c7a2 MDEV-34665 Simplify IN predicate processing for NULL-aware materialization involving only one column
It was found that unnecessary work of building Ordered_key structures
is being done when processing NULL-aware materialization for IN predicates
having only one column. In fact, the logic for that simplified case can be
expressed as follows.
Say we have predicate left_expr IN (SELECT <subq1>), where left_expr is
scalar(not a tuple).
Then
    if (left_expr is NULL) {
      if (subq1 produced any rows) {
        // note that we don't care if subq1 has produced
        // NULLs or not.
        NULL IN (<some values>) -> UNKNOWN, i.e. NULL.
      } else {
        NULL IN ({empty-set}) -> FALSE.
      }
    } else {
      // left_expr is a non-NULL value
      if (subq1 output has a match for left_expr) {
        left_expr IN (..., left_expr ...) -> TRUE
      } else {
        // no "known" matches.
        if (subq1 output has a NULL) {
          left_expr IN ( ... NULL ...) ->
           (NULL could have been a match or not)
           -> NULL.
        } else {
          // subq1 didn't produce any "UNKNOWNs" so
          // we're positive there weren't any matches
          -> FALSE.
        }
      }
    }

This commit introduces subselect_single_column_partial_engine class
implementing the logic described.

Reviewer: Sergei Petrunia <sergey@mariadb.com>
2024-10-30 16:48:36 +07:00
Alexander Barkov
4c7cfd2624 MDEV-34817 perfschema.lowercase_fs_off fails on buildbot
I pushed this patch in a mistake to 11.7 instead of 11.6.
Backporting from 11.7.

This is a workaround patch to make buildbot green.

Renaming databases from db1/DB2 to m33020_db1/m33020_DB1
to make them unique. So the garbage left by other tests
does not show up any more.

The real problem will be fixed under terms of:
  MDEV-35282 Performance schema does not clear package routines
2024-10-30 13:47:19 +04:00
Alexander Barkov
8c0a260a5b MDEV-35250 Assertion `dec <= 6' failed in my_timestamp_binary_length
The TIMESTAMP related code did not handle AUTO_SEC_PART_DIGITS.
FROM_UNIXTIME() sets its member 'decimals' to AUTO_SEC_PART_DIGITS.
So some scripts involving FROM_UNIXTIME() crashed on assert in debug
builds and returned unexpected results in release builds.
2024-10-30 11:09:21 +04:00
Alexander Barkov
a79f314f1b MDEV-34817 perfschema.lowercase_fs_off fails on buildbot
This is a workaround patch to make buildbot green.

Renaming databases from db1/DB2 to m33020_db1/m33020_DB1
to make them unique. So the garbage left by other tests
does not show up any more.

The real problem will be fixed under terms of:
  MDEV-35282 Performance schema does not clear package routines
2024-10-30 10:21:29 +04:00
Oleksandr Byelkin
69d033d165 Merge branch '10.11' into 11.2 2024-10-29 16:42:46 +01:00
Aleksey Midenkov
cc183489da MDEV-27293 Allow converting a versioned table from implicit
to explicit row_start/row_end columns

In case of adding both system fields of same type (length, unsigned
flag) as old implicit system fields do the rename of implicit system
fields to the ones specified in ALTER, remove SYSTEM_INVISIBLE flag in
that case. Correct PERIOD clause must be specified in ALTER as well.

MDEV-34904 Inplace alter for implicit to explicit versioning is broken

Whether ALTER goes inplace and how it goes inplace depends on
handler_flags which goes from alter_info->flags by this logic:

  ha_alter_info->handler_flags|= (alter_info->flags & ~flags_to_remove);

ALTER_VERS_EXPLICIT was not in flags_to_remove and its value (1ULL <<
35) clashed with ALTER_ADD_NON_UNIQUE_NON_PRIM_INDEX.

ALTER_VERS_EXPLICIT must not affect inplace, it is SQL-only so we
remove it from handler_flags.
2024-10-29 17:46:40 +03:00
Oleksandr Byelkin
3d0fb15028 Merge branch '10.6' into 10.11 2024-10-29 15:24:38 +01:00
Sergei Golubchik
5e5c3c7cb6 post-merge changes
* remove duplicate test file
* move all uuidv7 tests into plugin/type_uuid/mysql-test/type_uuid/
* remove mysys/ changes
* auto my_random_bytes() fallback - removes duplicate code from uuid,
  and fixes all other users of my_random_bytes() that don't check
  the return value (because, perhaps, they don't need crypto-strong
  random bytes)
* End of 11.6 -> 11.7 in tests
* clarify the warning text
* UUID_VERSION_MASK()/UUID_VARIANT_MASK() must not depend on the version
* allow 4x more monotonic uuidv7 per millisecond - instead of stretching
  1000 microseconds over 12 bits, let's use extra 2 bits as a counter
* rename for compatibility with Percona Server (uuid_v4, uuid_v7)
2024-10-29 14:47:32 +01:00
StefanoPetrilli
2fe269fdcb MDEV-32637 Implement native UUID7 function 2024-10-29 14:47:32 +01:00
Oleksandr Byelkin
f00711bba2 Merge branch '10.5' into 10.6 2024-10-29 14:20:03 +01:00
Oleksandr Byelkin
6aa47fae30 MDEV-35276 Assertion failure in find_producing_item upon a query from a view
Two problem solved:

1) Item_default_value makes a shallow copy so the copy
 should not delete field belong to the Item

2) Item_default_value should not inherit
 derived_field_transformer_for_having and
 derived_field_transformer_for_where (in this variant
 pushing DEFAULT(f) is prohibited (return NULL) but
 if return "this" it will be allowed (should go with
 a lot of tests))
2024-10-29 11:44:43 +01:00
Thirunarayanan Balathandayuthapani
db3be9b434 MDEV-35237 Bulk insert fails to apply buffered operation during CREATE..SELECT statement
Problem:
=======
- InnoDB fails to write the buffered insert operation during
create..select operation. This happens when bulk_insert
in transaction is reset to false while unlocking a source table.

Fix:
===
- InnoDB should apply the previous buffered changes to
all tables if we encounter any statement other than
pure INSERT or INSERT..SELECT statement in
ha_innobase::external_lock() and start_stmt().

- Remove the function bulk_insert_apply_for_table()

start_stmt(), external_lock(): Assert that trx->duplicates
should be enabled during bulk insert operation
2024-10-29 15:03:23 +05:30
Alexander Barkov
a88c71b294 MDEV-35041 Simple comparison causes "Illegal mix of collations" even with default server settings
The task "MDEV-25829 Change default Unicode collation to uca1400_ai_ci"
previously changed collation derivation for string user variables
from DERIVATION_EXPLICIT to DERIVATION_COERCIBLE, to resolve illegal
collation mix conflicts between table columns and user variables
when they have different collations.

However, DERIVATION_COERCIBLE was a wrong choice because it caused
conflicts between string literals and user variables when they have
different collations.

Adding a new collation derivation level DERIVATION_USERVAR.
This makes the collation of a user variable:
- weaker than a table column (like it was intended by MDEV-25829)
- but stronger than a literal (like it was in pre-MDEV-25829)

Cleanup in sql_type.h:
  Removing the line "- BINARY(expr)" from the before-DERIVATION_CAST
  comment, as it was on a wrong place. It's also listed on the correct
  place before DERIVATION_IMPLICIT.
2024-10-28 16:30:49 +04:00
Monty
066f920484 MDEV-35110 Deadlock on Replica during BACKUP STAGE BLOCK_COMMIT on XA transactions
This is an extension of MDEV-30423 "Deadlock on Replica during BACKUP
STAGE BLOCK_COMMIT on XA transactions"

The original commit in MDEV-30423 was not complete as some usage in XA of
MDL_BACKUP_COMMIT locks did not set thd->backup_commit_lock.
This is required to be set when using parallel replication.

Fixed by ensuring that all usage of BACKUP_COMMIT lock i XA is uniform and
all sets thd->backup_commit_lock. I also changed all locks to be
MDL_EXPLICIT to keep also that part uniform.

A regression test is added.
2024-10-28 13:29:21 +02:00
Vladislav Vaintroub
4b068b7fcb MDEV-32387 - prevent output during temporary changes to STDOUT/STDERR
create_process() temporarily changes STDOUT/STDERR output to error log file

This might redirect mtr output on Windows, so avoid it by holding
flush_lock.
2024-10-28 10:45:50 +01:00
Marko Mäkelä
63a7e4c96b MDEV-35257 Backup fails during an ALTER TABLE with FULLTEXT INDEX
In commit 1c55b845e0 (MDEV-32932) the
test mariabackup.innodb_ddl_on_intermediate_table was introduced but
disabled.

xb_load_single_table_tablespace(): Properly handle missing FTS_ tables.

backup_file_op_fail(): Properly handle FILE_DELETE records.
2024-10-28 07:44:18 +02:00
Sergei Petrunia
284593413f MDEV-35253: xa_prepare_unlock_unmodified fails: shift exponent 32 is too large
The code in best_access_path() uses PREV_BITS(uint, N) to
compute a bitmap of all keyparts: {keypart0, ... keypart{N-1}).

The problem is that PREV_BITS($type, N) macro code can't handle the case
when N=<number of bits in $type).
Also, why use PREV_BITS(uint, ...) for key part map computations when
we could have used PREV_BITS(key_part_map) ?

Fixed both:
- Change PREV_BITS(type, N) to handle any N in [0; n_bits(type)].
- Change PREV_BITS() to use key_part_map when computing key_part_map bitmaps.
2024-10-25 18:02:14 +03:00
Yuchen Pei
b8c2bd9f69
MDEV-35249 Fix regression caused by MDEV-34447
MDEV-34447 Removed setting first_cond_optimization to 0 in update and
delete when leaf_tables_saved. This can cause problems when two ps
executions of an update go through different paths, where the first ps
execution goes through single table update only and the second ps
execution also goes through multi table update. When this happens, the
first_cond_optimization of the outer query is not set to false during
the first ps execution because optimize() is not called for the outer
query. But then the second ps execution will call optimize() on the
outer query, which with first_cond_optimization==true trips the 2nd ps
mem leak detection.

This is not a problem in higher version as both executions go through
multi table updates, possibly due to MDEV-28883.

We fix this problem by restoring the FALSE assignments to
first_cond_optimization.
2024-10-25 18:03:40 +11:00
Sergei Golubchik
3cd706b107 MDEV-35236 Assertion `(mem_root->flags & 4) == 0' failed in safe_lexcstrdup_root
Post-fix for MDEV-35144.

Cannot allocate options values on the statement arena, because
HA_CREATE_INFO is shallow-copied for every execution, so if the
option_list was initially empty, it will be reset for every execution
and any values allocated on the  statement arena will be lost.

Cannot allocate option values on the execution arena, because
HA_CREATE_INFO is shallow-copied for every execution, so if the
option_list was  initially NOT empty, any values appended to the
end will be preserved and if they're on the execution arena their
content will be destroyed.

Let's use thd->change_item_tree() to save and restore necessary pointers
for every execution.

followup for 3da565c41d
2024-10-23 14:58:57 +02:00
Sergei Golubchik
eac33a23da MDEV-32022 ERROR 1054 (42S22): Unknown column 'X' in 'NEW' in trigger
add missing do_get_copy/do_build_clone
2024-10-23 14:58:57 +02:00
Yuchen Pei
4b6922a315
MDEV-25008: UPDATE/DELETE: Cost-based choice IN->EXISTS vs Materialization
Single-table UPDATE/DELETE didn't provide outer_lookup_keys value for
subqueries. This didn't allow to make a meaningful choice between
IN->EXISTS and Materialization strategies for subqueries.

Fix this:
* Make UPDATE/DELETE save Sql_cmd_dml::scanned_rows,
* Then, subquery's JOIN::choose_subquery_plan() can fetch it from
there for outer_lookup_keys

Details:
UPDATE/DELETE now calls select_lex->optimize_unflattened_subqueries()
twice, like SELECT does (first call optimize_constant_subquries() in
JOIN::optimize_inner(), then call optimize_unflattened_subqueries() in
JOIN::optimize_stage2()):
1. Call with const_only=true before any optimizations. This allows
range optimizer and others to use the values of cheap const
subqueries.
2. Call it with const_only=false after range optimizer, partition
pruning, etc. outer_lookup_keys value is provided, so it's possible to
pick a good subquery strategy.

Note: PROTECT_STATEMENT_MEMROOT requires that first SP execution
performs subquery optimization for all subqueries, even for degenerate
query plans like "Impossible WHERE". Due to that, we ensure that the
call to optimize_unflattened_subqueries (with const_only=false) even
for degenerate query plans still happens, as was the case before this
change.
2024-10-23 23:51:24 +11:00
Oleg Smirnov
6bd1cb0ea0 MDEV-34880 Incorrect result for query with derived table having TEXT field
When a derived table which has distinct values and BLOB fields is
materialized, an index is created over all columns to ensure only
unique values are placed to the result.
This index is created in a special mode HA_UNIQUE_HASH to support BLOBs.
Later the optimizer may incorrectly choose this index to retrieve values
from the derived table, although such type of index cannot be used
for data retrieval.

This commit excludes HA_UNIQUE_HASH indexes from adding to
`JOIN::keyuse` array thus preventing their subsequent usage for
data retrieval
2024-10-23 17:55:00 +07:00
Vlad Lesin
8c7786e7d5 MDEV-34690 lock_rec_unlock_unmodified() causes deadlock
lock_rec_unlock_unmodified() is executed either under lock_sys.wr_lock()
or under a combination of lock_sys.rd_lock() + record locks hash table
cell latch. It also requests page latch to check if locked records were
changed by the current transaction or not.

Usually InnoDB requests page latch to find the certain record on the
page, and then requests lock_sys and/or record lock hash cell latch to
request record lock. lock_rec_unlock_unmodified() requests the latches
in the opposite order, what causes deadlocks. One of the possible
scenario for the deadlock is the following:

thread 1 - lock_rec_unlock_unmodified() is invoked under locks hash table
           cell latch, the latch is acquired;
thread 2 - purge thread acquires page latch and tries to remove
           delete-marked record, it invokes lock_update_delete(), which
           requests locks hash table cell latch, held by thread 1;
thread 1 - requests page latch, held by thread 2.

To fix it we need to release lock_sys.latch and/or lock hash cell latch,
acquire page latch and re-acquire lock_sys related latches.

When lock_sys.latch and/or lock hash cell latch are released in
lock_release_on_prepare() and lock_release_on_prepare_try(), the page on
which the current lock is held, can be merged. In this case the bitmap
of the current lock must be cleared, and the new lock must be added to
the end of trx->lock.trx_locks list, or bitmap of already existing lock
must be changed.

The new field trx_lock_t::set_nth_bit_calls indicates if new locks
(bits in existing lock bitmaps or new lock objects) were created during
the period when lock_sys was released in trx->lock.trx_locks list
iteration loop in lock_release_on_prepare() or
lock_release_on_prepare_try(). And, if so, we traverse the list again.

The block can be freed during pages merging, what causes assertion
failure in buf_page_get_gen(), as btr_block_get() passes BUF_GET as page
get mode to it. That's why page_get_mode parameter was added to
btr_block_get() to pass BUF_GET_POSSIBLY_FREED from
lock_release_on_prepare() and lock_release_on_prepare_try() to
buf_page_get_gen().

As searching for id of trx, which modified secondary index record, is
quite expensive operation, restrict its usage for master. System variable
was added to remove the restriction for testing simplifying. The
variable exists only either for debug build or for build with
-DINNODB_ENABLE_XAP_UNLOCK_UNMODIFIED_FOR_PRIMARY option to increase the
probability of catching bugs for release build with RQG.

Note that the code, which does primary index lookup to find out what
transaction modified secondary index record, is necessary only when
there is no primary key and no unique secondary key on replica with row
based replication, because only in this case extra X locks on unmodified
records can be set during scan phase.

Reviewed by Marko Mäkelä.
2024-10-23 12:36:17 +03:00
Vlad Lesin
92180ad513 MDEV-34466 XA prepare don't release unmodified records for some cases
There is no need to exclude exclusive non-gap locks from the procedure
of locks releasing on XA PREPARE execution in
lock_release_on_prepare_try() after commit
17e59ed3aa (MDEV-33454), because
lock_rec_unlock_unmodified() should check if the record was modified
with the XA, and release the lock if it was not.

lock_release_on_prepare_try(): don't skip X-locks, let
lock_rec_unlock_unmodified() to process them.

lock_sec_rec_some_has_impl(): add template parameter for not acquiring
trx_t::mutex for the case if a caller already holds the mutex, don't
crash if lock's bitmap is clean.

row_vers_impl_x_locked(), row_vers_impl_x_locked_low(): add new argument
to skip trx_t::mutex acquiring.

rw_trx_hash_t::validate_element(): don't acquire trx_t::mutex if the
current thread already holds it.

Thanks to Andrei Elkin for finding the bug.
Reviewed by Marko Mäkelä, Debarun Banerjee.
2024-10-23 12:36:17 +03:00
Marko Mäkelä
1cad1dbde6 MDEV-35235 innodb_snapshot_isolation=ON fails to signal transaction rollback
convert_error_code_to_mysql(): Treat DB_DEADLOCK and DB_RECORD_CHANGED
in the same way, that is, signal to the SQL layer that the transaction
had been rolled back.
2024-10-23 07:55:22 +03:00
Jan Lindström
b3be3c2157 MDEV-30653 : With wsrep_mode=REPLICATE_ARIA only part of mixed-engine transactions is replicated
Replication of non-transactional engines is experimental and
uses TOI. This naturally means that if there is open transaction
with transactional engine it's changes will be rolled back.

Fixed by adding error message if non-transactional engine
is part of multi-engine transaction with warning.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-10-23 04:00:52 +02:00
Jan Lindström
7ffa7b6b01 MDEV-31888 : galera.galera_wan, galera.galera_vote_rejoin_* fail
Clean up configuration and tests. Add wait conditions to make
sure test continues from clean state.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-10-23 03:47:08 +02:00
Oleg Smirnov
fd87e01f38 MDEV-27277 Add a warning when max_sort_length is reached
During a query execution some sorting and grouping operations
on strings may be involved. System variable max_sort_length defines
the maximum number of bytes to use when comparing strings during
sorting/grouping. Thus, the comparable parts of strings may be less
than their actual size, so the results of the query may be not
sorted/grouped properly.
To indicate that some comparisons were done on a truncated lengths,
a new warning has been introduced with this commit.
2024-10-22 22:39:36 +07:00
Alexander Barkov
0d17c540a5 MDEV-27277 Add a warning when max_sort_length is reached
Step#1: fixing the return type of strnxfrm() from size_t to this structure:

typedef struct
{
  size_t m_output_length;
  size_t m_source_length_used;
  uint m_warnings;
} my_strnxfrm_ret_t;
2024-10-22 21:42:53 +07:00
Oleksandr Byelkin
9b3413c71f MDEV-8578: fix galera test 2024-10-22 09:23:56 +02:00
Oleksandr Byelkin
d29611afa1 MDEV-15497 fixed outdated syntax 2024-10-22 09:12:23 +02:00
Brandon Nesterenko
1ed30e08af MDEV-34122: Assertion `entry' failed in Active_tranx::assert_thd_is_waiter
If semi-sync is switched off then on while a transaction is
in-between binlogging and waiting for an ACK, the semi-sync state of
the transaction is removed, leading to a debug assertion that
indicates the transaction tried to wait, but cannot receive an ACK
signal. More specifically, when semi-sync is switched off, the
Active_tranx list is cleared (where a transaction adds an entry to
this list during binlogging), and each entry in this list saves the
thread which will wait for an ACK, and the thread has the COND
variable to signal to wake itself. So if the entry is lost, the
Ack_receiver thread won’t be able to find the thread to wake up when
an ACK comes in

The fix is to ensure that the entry exists before awaiting the ACK,
and if there is no entry, skip the wait. In debug builds, an
informative message is written explaining that the transaction is
skipping its wait. Additional debug-build only logic is added to
ensure that the cause of the missing entry is due to semi-sync being
turned off and on

Reviewed By:
============
Kristian Nielsen <knielsen@knielsen-hq.org>
2024-10-21 15:35:54 -06:00
Alexander Barkov
855c21eb99 Recording ctype_gbk_export_import.result according to MDEV-34883 2024-10-21 14:08:00 +04:00
Thirunarayanan Balathandayuthapani
7f7d78bc18 MDEV-35183 ADD FULLTEXT INDEX unnecessarily DROPS FTS COMMON TABLES
- InnoDB fulltext rebuilds the FTS COMMON table while adding the
new fulltext index. This can be optimized by avoiding rebuilding
the FTS COMMON table in case of FTS COMMON TABLE already exists.

Reviewed-by: Marko Mäkelä <marko.makela@mariadb.com>
2024-10-21 12:27:09 +05:30
Alexander Barkov
e1cd3c4033 MDEV-12252 ROW data type for stored function return values
Adding support for the ROW data type in the stored function RETURNS clause:

- explicit ROW(..members...) for both sql_mode=DEFAULT and sql_mode=ORACLE

  CREATE FUNCTION f1() RETURNS ROW(a INT, b VARCHAR(32)) ...

- anchored "ROW TYPE OF [db1.]table1" declarations for sql_mode=DEFAULT

  CREATE FUNCTION f1() RETURNS ROW TYPE OF test.t1 ...

- anchored "[db1.]table1%ROWTYPE" declarations for sql_mode=ORACLE

  CREATE FUNCTION f1() RETURN test.t1%ROWTYPE ...

Adding support for anchored scalar data types in RETURNS clause:

- "TYPE OF [db1.]table1.column1" for sql_mode=DEFAULT

  CREATE FUNCTION f1() RETURNS TYPE OF test.t1.column1;

- "[db1.]table1.column1" for sql_mode=ORACLE

  CREATE FUNCTION f1() RETURN test.t1.column1%TYPE;

Details:

- Adding a new sql_mode_t parameter to
    sp_head::create()
    sp_head::sp_head()
    sp_package::create()
    sp_package::sp_package()
  to guarantee early initialization of sp_head::m_sql_mode.
  Before this change, this member was not initialized at all during
  CREATE FUNCTION/PROCEDURE/PACKAGE statements, and was not used.
  Now it needs to be initialized to write properly the
  mysql.proc.returns column, according to the create time sql_mode.

- Code refactoring to make the things simpler and functions smaller:

  * Adding a new method
    Field_row::row_create_fields(THD *thd, List<Spvar_definition> *list)
    to make a Virtual_tmp_table with Fields for ROW members
    from an explicit definition.

  * Adding a new method
    Field_row::row_create_fields(THD *thd, const Spvar_definition &def)
    to make a Virtual_tmp_table with Fields for ROW members
    from an explicit or a table anchored definition.

  * Adding a new method
    Item_args::add_array_of_item_field(THD *thd, const Virtual_tmp_table &vtable)
    to create and array of Item_field corresponding to all Field instances
    in a Virtual_tmp_table

  * Removing Item_field_row::row_create_items(). It was decomposed
    into the new methods described above.

  * Moving the code from the loop body in sp_rcontext::init_var_items()
    into a separate method Spvar_definition::make_item_field_row(),
    to make the code clearer (smaller functions).
    make_item_field_row() itself uses the new methods described above.

- Changing the data type of sp_head::m_return_field_def
  from Column_definition to Spvar_definition.
  So now it supports not only SQL column field types,
  but also explicit ROW and anchored ROW data types,
  as well as anchored column types.

- Adding a new Column_definition parameter to sp_head::create_result_field().
  Before this patch, create_result_field() took the definition only
  from m_return_field_def. Now it's also called with a local Column_definition
  variable which contains the explicit definition resolved from an
  anchored defition.

- Modifying sql_yacc.yy to support the new grammar.
  Adding new helper methods:
    * sf_return_fill_definition_row()
    * sf_return_fill_definition_rowtype_of()
    * sf_return_fill_definition_type_of()

- Fixing tests in:
  * Virtual_tmp_table::setup_field_pointers() in sql_select.cc
  * Send_field::normalize() in field.h
  * store_column_type()
  to prevent calling Type_handler_row::field_type(),
  which is implemented a DBUG_ASSERT(0).
  Before this patch the affected methods and functions were called only
  for scalar data types. Now ROW is also possible.

- Adding a new virtual method Field::cols()

- Overriding methods:
   Item_func_sp::cols()
   Item_func_sp::element_index()
   Item_func_sp::check_cols()
   Item_func_sp::bring_value()
  to support the ROW data type.

- Extending the rule sp_return_type to support
  * explicit ROW and anchored ROW data types
  * anchored scalar data types

- Overriding Field_row::sql_type() to print
  the data type of an explicit ROW.
2024-10-21 07:59:29 +04:00
Alexander Barkov
dfaf7e2eb4 MDEV-15751 CURRENT_TIMESTAMP should return a TIMESTAMP [WITH TIME ZONE?]
Changing the return type of the following functions:
  - CURRENT_TIMESTAMP, CURRENT_TIMESTAMP(), NOW()
  - SYSDATE()
  - FROM_UNIXTIME()
from DATETIME to TIMESTAMP.

Note, the old function NOW() returning DATETIME is still available
as LOCALTIMESTAMP or LOCALTIMESTAMP(), e.g.:

  SELECT
    LOCALTIMESTAMP,     -- DATETIME
    CURRENT_TIMESTAMP;  -- TIMESTAMP

The change in the functions return data type fixes some problems
that occurred near a DST change:

- Problem #1

INSERT INTO t1 (timestamp_field) VALUES (CURRENT_TIMESTAMP);
INSERT INTO t1 (timestamp_field) VALUES (COALESCE(CURRENT_TIMESTAMP));

could result into two different values inserted.

- Problem #2

INSERT INTO t1 (timestamp_field) VALUES (FROM_UNIXTIME(1288477526));
INSERT INTO t1 (timestamp_field) VALUES (FROM_UNIXTIME(1288477526+3600));

could result into two equal TIMESTAMP values near a DST change.

Additional changes:

- FROM_UNIXTIME(0) now returns SQL NULL instead of '1970-01-01 00:00:00'
  (assuming time_zone='+00:00')

- UNIX_TIMESTAMP('1970-01-01 00:00:00') now returns SQL NULL instead of 0
  (assuming time_zone='+00:00'

These additional changes are needed for consistency with TIMESTAMP fields,
which cannot store '1970-01-01 00:00:00 +00:00'
2024-10-19 22:48:23 +02:00
Sergei Golubchik
128fc34990 fix rdiff files in sys_var suite 2024-10-19 16:54:48 +02:00
Sergei Golubchik
15a291e4e0 MDEV-14978 fix client.client-env-variable test
* fix paths to work when installed and not only from the source dir
* don't use a cnf file (no need to restart the server for this)
* set MYSQL_HOST to a valid hostname when testing an invalid MARIADB_HOST
* use invalid ip to have clients fail quickly and not waste time
  on resolving the invalid hostname

followup for eedbb901e5
2024-10-19 16:53:16 +02:00
Kristian Nielsen
abc46259c6 MDEV-34753 memory pressure - erroneous termination condition
Fix race condition in test case by waiting for the expected state to occur.

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-10-19 17:20:27 +11:00
Daniel Black
eb29190398 MDEV-34753 memory pressure - erroneous termination condition
The 'if (!m_abort) break' condition was inverted by accident.

Constrain the test case to environments where there is cgroupv2
runtime environment which is the same case that will pass a memory
pressure initialization.

Remove the explicit garbage_collection trigger as it hides the abnormal
termination error on the event loop for memory pressure. This
also means there is no support in non-cgroupv2 environments
(possibly some container environments).

As the trigger to memory pressure is via a different thread we
need to wait until a "[mM]emory pressure" log message is there to
know it has succeeded or failed.

Thanks Kristian Nielsen for noticing and review.
2024-10-19 17:20:27 +11:00
Sergei Petrunia
a68e74b5a4 MDEV-35164: optimizer_join_limit_pref_ratio: assertion when the ORDER BY table becomes constant
Assertion failure has happened due to this scenario:

A query was ran with optimizer_join_limit_pref_ratio=1.
The query had "ORDER BY t1.col LIMIT N".
The optimizer set join->limit_shortcut_applicable=1.
Then, table t1 was marked as constant.
The code in choose_query_plan() still set join->limit_optimization_mode=1
which caused the optimizer to only consider t1 as the first non-const table.
But t1 was already put into the join prefix as the constant table.
The optimizer couldn't produce any join order at all and crashed.

Fixed by not searching for shortcut plan if ORDER BY table is a constant.
We will not try to do sorting anyway in this case (and LIMIT short-cutting
will be done for any join order).
2024-10-18 15:42:05 +03:00
Rucha Deodhar
e14d2b7974 MDEV-8578: Wrong error code/message with enforce_storage_engine and
NO_ENGINE_SUBSTITUTION

Analysis:
When the error is hit, wrong error code is passed in my_error
Fix:
Pass a better error code.
2024-10-18 16:42:52 +05:30
Sergei Petrunia
0540eac05c MDEV-35180: ref_to_range rewrite causes poor query plan
(Variant 2: only allow rewrite for ref(const))

make_join_select() has a "ref_to_range" rewrite: it would rewrite
any ref access to a range access on the same index if the latter uses
more keyparts.
It seems, he initial intent of this was to fix poor query plan choice
in cases like

  t.keypart1=const AND t.keypart2 < 'foo'

Due to deficiency in cost model, ref access could be picked while range
would enumerate fewer rows and be cheaper.
However, the condition also forces a rewrite in cases like:

  t.keypart1=prev_table.col AND t.keypart1<='foo' AND t.keypart2<'bar'

Here, it can be that
* keypart1=prev_table.col is highly selective
* (keypart1, keypart2) <= ('foo', 'bar') is not at all selective.

Still, the rewrite would be made and poor query plan chosen.
Fixed this by only doing the rewrite if ref access was ref(const)
so we can be certain that quick select also used these restrictions
and will scan a subset of rows that ref access would scan.
2024-10-18 13:37:04 +03:00
Marko Mäkelä
ebefef658e Merge 10.11 into 11.2 2024-10-18 11:32:22 +03:00
Marko Mäkelä
eca552a1a4 MDEV-34830: LSN in the future is not being treated as serious corruption
The invariant of write-ahead logging is that before any change to a
page is written to the data file, the corresponding log record must
must first have been durably written.

In crash recovery, there were some sloppy checks for this. Let us
implement accurate checks and flag an inconsistency as a hard error,
so that we can avoid further corruption of a corrupted database.
For data extraction from the corrupted database, innodb_force_recovery
can be used.

Before recovery is reading any data pages or invoking
buf_dblwr_t::recover() to recover torn pages from the
doublewrite buffer, InnoDB will have parsed the log until the
final LSN and updated log_sys.lsn to that. So, we can rely on
log_sys.lsn at all times. The doublewrite buffer recovery has been
refactored in such a way that the recv_sys.dblwr.pages may be consulted
while discovering files and their page sizes, but nothing will be
written back to data files before buf_dblwr_t::recover() is invoked.

recv_max_page_lsn, recv_lsn_checks_on: Remove.

recv_sys_t::validate_checkpoint(): Validate the write-ahead-logging
condition at the end of the recovery.

recv_dblwr_t::validate_page(): Keep track of the maximum LSN
(if we are checking a non-doublewrite copy of a page) but
do not complain LSN being in the future. The doublewrite buffer
is a special case, because it will be read early during recovery.
Besides, starting with commit 762bcb81b5
the dblwr=true copies of pages may legitimately be "too new".

recv_dblwr_t::find_page(): Find a valid page with the smallest
FIL_PAGE_LSN that is in the valid range for recovery.

recv_dblwr_t::restore_first_page(): Replaced by find_page().
Only buf_dblwr_t::recover() will write to data files.

buf_dblwr_t::recover(): Simplify the message output. Do attempt
doublewrite recovery on user page read error. Ignore doublewrite
pages whose FIL_PAGE_LSN is outside the usable bounds. Previously,
we could wrongly recover a too new page from the doublewrite buffer.
It is unlikely that this could have lead to an actual error.
Write back all recovered pages from the doublewrite buffer here,
including for the first page of any tablespace.

buf_page_is_corrupted(): Distinguish the return values
CORRUPTED_FUTURE_LSN and CORRUPTED_OTHER.

buf_page_check_corrupt(): Return the error code DB_CORRUPTION
in case the LSN is in the future.

Datafile::read_first_page_flags(): Split from read_first_page().
Take a copy of the first page as a parameter.

recv_sys_t::free_corrupted_page(): Take the file as a parameter
and return whether a message was displayed. This avoids some duplicated
and incomplete error messages.

buf_page_t::read_complete(): Remove some redundant output and always
display the name of the corrupted file. Never return DB_FAIL;
use it only in internal error handling.

IORequest::read_complete(): Assume that buf_page_t::read_complete()
will have reported any error.

fil_space_t::set_corrupted(): Return whether this is the first time
the tablespace had been flagged as corrupted.

Datafile::validate_first_page(), fil_node_open_file_low(),
fil_node_open_file(), fil_space_t::read_page0(),
fil_node_t::read_page0(): Add a parameter for a copy of the
first page, and a parameter to indicate whether the FIL_PAGE_LSN
check should be suppressed. Before buf_dblwr_t::recover() is
invoked, we cannot validate the FIL_PAGE_LSN, but we can trust the
FSP_SPACE_FLAGS and the tablespace ID that may be present in a
potentially too new copy of a page.

Reviewed by: Debarun Banerjee
2024-10-18 10:12:47 +03:00
Brandon Nesterenko
e213e916ad MDEV-32014: Fix mysqld--help,win.rdiff 2024-10-17 15:53:00 -06:00
Sergei Golubchik
3a1cf2c85b MDEV-34679 ER_BAD_FIELD uses non-localizable substrings 2024-10-17 21:37:37 +02:00
Sergei Golubchik
3693fb9581 MDEV-25199 cluster fails to start up
if you need innodb in your test - enable it yourself
2024-10-17 21:37:37 +02:00
Sergei Golubchik
e1e836fc76 update results after the merge 2024-10-17 21:37:37 +02:00
Sergei Golubchik
3da565c41d MDEV-35144 CREATE TABLE ... LIKE uses current innodb_compression_default instead of the create value
When adding a column or index that uses plugin-defined
sysvar-based options with CREATE ... LIKE the server
was using the current value of the sysvar, not the default one.

Because parse_option_list() function was used both in create
and open and it tried to guess when it's create (need to use
current sysvar value and add a new name=value pair to the list)
or open (need to use default, without extending the list).

Let's move the list extending functionality into a separate
function and call it explicitly when needed. Operations that
add new objects (CREATE, ALTER ... ADD) will extend the list,
other operations (ALTER, CREATE ... LIKE, open) will not.
2024-10-17 16:28:39 +02:00
Marko Mäkelä
bb47e575de MDEV-34830: LSN in the future is not being treated as serious corruption
The invariant of write-ahead logging is that before any change to a
page is written to the data file, the corresponding log record must
must first have been durably written.

On crash recovery, there were some sloppy checks for this. Let us
implement accurate checks and flag an inconsistency as a hard error,
so that we can avoid further corruption of a corrupted database.
For data extraction from the corrupted database, innodb_force_recovery
can be used.

Before recovery is reading any data pages or invoking
buf_dblwr_t::recover() to recover torn pages from the
doublewrite buffer, InnoDB will have parsed the log until the
final LSN and updated log_sys.lsn to that. So, we can rely on
log_sys.lsn at all times. The doublewrite buffer recovery has been
refactored in such a way that the recv_sys.dblwr.pages may be consulted
while discovering files and their page sizes, but nothing will be
written back to data files before buf_dblwr_t::recover() is invoked.

A section of the test mariabackup.innodb_redo_overwrite
that is parsing some mariadb-backup --backup output has
been removed, because that output "redo log block is overwritten"
would often be missing in a Microsoft Windows environment
as a result of these changes.

recv_max_page_lsn, recv_lsn_checks_on: Remove.

recv_sys_t::validate_checkpoint(): Validate the write-ahead-logging
condition at the end of the recovery.

recv_dblwr_t::validate_page(): Keep track of the maximum LSN
(if we are checking a non-doublewrite copy of a page) but
do not complain LSN being in the future. The doublewrite buffer
is a special case, because it will be read early during recovery.
Besides, starting with commit 762bcb81b5
the dblwr=true copies of pages may legitimately be "too new".

recv_dblwr_t::find_page(): Find a valid page with the smallest
FIL_PAGE_LSN that is in the valid range for recovery.

recv_dblwr_t::restore_first_page(): Replaced by find_page().
Only buf_dblwr_t::recover() will write to data files.

buf_dblwr_t::recover(): Simplify the message output. Do attempt
doublewrite recovery on user page read error. Ignore doublewrite
pages whose FIL_PAGE_LSN is outside the usable bounds. Previously,
we could wrongly recover a too new page from the doublewrite buffer.
It is unlikely that this could have lead to an actual error.
Write back all recovered pages from the doublewrite buffer here,
including for the first page of any tablespace.

buf_page_is_corrupted(): Distinguish the return values
CORRUPTED_FUTURE_LSN and CORRUPTED_OTHER.

buf_page_check_corrupt(): Return the error code DB_CORRUPTION
in case the LSN is in the future.

Datafile::read_first_page(): Handle FSP_SPACE_FLAGS=0xffffffff
in the same way on both 32-bit and 64-bit architectures.

Datafile::read_first_page_flags(): Split from read_first_page().
Take a copy of the first page as a parameter.

recv_sys_t::free_corrupted_page(): Take the file as a parameter
and return whether a message was displayed. This avoids some duplicated
and incomplete error messages.

buf_page_t::read_complete(): Remove some redundant output and always
display the name of the corrupted file. Never return DB_FAIL;
use it only in internal error handling.

IORequest::read_complete(): Assume that buf_page_t::read_complete()
will have reported any error.

fil_space_t::set_corrupted(): Return whether this is the first time
the tablespace had been flagged as corrupted.

Datafile::validate_first_page(), fil_node_open_file_low(),
fil_node_open_file(), fil_space_t::read_page0(),
fil_node_t::read_page0(): Add a parameter for a copy of the
first page, and a parameter to indicate whether the FIL_PAGE_LSN
check should be suppressed. Before buf_dblwr_t::recover() is
invoked, we cannot validate the FIL_PAGE_LSN, but we can trust the
FSP_SPACE_FLAGS and the tablespace ID that may be present in a
potentially too new copy of a page.

Reviewed by: Debarun Banerjee
2024-10-17 17:24:20 +03:00
Brandon Nesterenko
39cce39ae1 MDEV-32014: typo fix in test 2024-10-17 07:54:09 -06:00
Sergei Golubchik
70aa713f58 MDEV-32014 test fix 2024-10-17 07:53:59 -06:00
Libing Song
72cc58bb71 MDEV-32014 Rename binlog cache temporary file to binlog file
for large transaction

Description
===========
When a transaction commits, it copies the binlog events from
binlog cache to binlog file. Very large transactions
(eg. gigabytes) can stall other transactions for a long time
because the data is copied while holding LOCK_log, which blocks
other commits from binlogging.

The solution in this patch is to rename the binlog cache file to
a binlog file instead of copy, if the commiting transaction has
large binlog cache. Rename is a very fast operation, it doesn't
block other transactions a long time.

Design
======
* binlog_large_commit_threshold
  type: ulonglong
  scope: global
  dynamic: yes
  default: 128MB

  Only the binlog cache temporary files large than 128MB are
  renamed to binlog file.

* #binlog_cache_files directory
  To support rename, all binlog cache temporary files are managed
  as normal files now. `#binlog_cache_files` directory is in the same
  directory with binlog files. It is created at server startup if it doesn't
  exist. Otherwise, all files in the directory is deleted at startup.

  The temporary files are named with ML_ prefix and the memorary address
  of the binlog_cache_data object which guarantees it is unique.

* Reserve space
  To supprot rename feature, It must reserve enough space at the
  begin of the binlog cache file. The space is required for
  Format description, Gtid list, checkpoint and Gtid events when
  renaming it to a binlog file.

  Since binlog_cache_data's cache_log is directly accessed by binlog log,
  online alter and wsrep. It is not easy to update all the code. Thus
  binlog cache will not reserve space if it is not session binlog cache or
  wsrep session is enabled.

  - m_file_reserved_bytes
    Stores the bytes reserved at the begin of the cache file.
    It is initialized in write_prepare() and cleared by reset().

    The reserved file header is hide to callers. Thus there is no
    change for callers. E.g.
    - get_byte_position() still get the length of binlog data
      written to the cache, but not the file length.
    - truncate(0) will truncate the file to m_file_reserved_bytes but not 0.

  - write_prepare()
    write_prepare() is called everytime when anything is being written
    into the cache. It will call init_file_reserved_bytes() to  create
    the cache file (if it doesn't exist) and reserve suitable space if
    the data written exceeds buffer's size.

* Binlog_commit_by_rotate
  It is used to encapsulate the code for remaing a binlog cache
  tempoary file to binlog file.
  - should_commit_by_rotate()
    it is called by write_transaction_to_binlog_events() to check if
    a binlog cache should be rename to a binlog file.
  - commit()
    That is the entry to rename a binlog cache and commit the
    transaction. Both rename and commit are protected by LOCK_log,
    Thus not other transactions can write anything into the renamed
    binlog before it.

    Rename happens in a rotation. After the new binlog file is generated,
    replace_binlog_file() is called to:
    - copy data from the new binlog file to its binlog cache file.
    - write gtid event.
    - rename the binlog cache file to binlog file.

    After that the rotation will continue to succeed. Then the transaction
    is committed in a seperated group itself. Its cache file will be
    detached and cache log will be reset before calling
    trx_group_commit_with_engines(). Thus only Xid event be written.
2024-10-17 07:53:59 -06:00
Oleksandr Byelkin
600c42ea86 MDEV-34883 LOAD DATA INFILE with geometry data fails
We write field using field data charset, so we should read it
using the field charset.
2024-10-17 10:33:36 +02:00
Sergei Golubchik
3b58c6b93f MDEV-35079 Migrate MySQL5.7 to MariaDB 10.4, then to MariaDB 10.11 Failed
correctly detect when partitioning is disabled
2024-10-17 10:08:24 +02:00
Sergei Golubchik
7842cab8c0 MDEV-34318 post-merge fix 2024-10-17 10:08:24 +02:00
Sergei Golubchik
6b436cba01 Revert "Fixes buildbot issue with plugin.fulltext_plugin"
This reverts commit a8010e7689.

The test doesn't require embedded after ab15628bbc
2024-10-17 09:11:47 +02:00
Thirunarayanan Balathandayuthapani
4a1ded61a4 MDEV-34529 Shrink the system tablespace when system tablespace contains MDEV-30671 leaked undo pages
- InnoDB fails to shrink the system tablespace when it contains
the leaked undo log pages caused by MDEV-30671.

- InnoDB does free the unused segment in system tablespace
before shrinking the tablespace. InnoDB fails to free
the unused segment if XA PREPARE transaction exist or
if the previous shutdown was not with innodb_fast_shutdown=0

inode_info: Structure to store the inode page and offsets.

fil_space_t::garbage_collect(): Frees the system tablespace
unused segment

fsp_get_sys_used_segment(): Iterates through all default
file segment and index segment present in system tablespace.

trx_sys_t::is_xa_exist(): Returns true if the XA transaction
exist in the undo logs

fseg_inode_free(): Frees the extents, fragment pages for the
given index node and ignores any error similar to
trx_purge_free_segment()

trx_sys_t::reset_page(): Retain the TRX_SYS_FSEG_HEADER value
in trx_sys page while resetting the page.
2024-10-16 21:34:24 +05:30
Monty
4955f6018a MDEV-29351 SIGSEGV when doing forward reference of item in select list
The reason for the crash was the code assumed that
SELECT_LEX.ref_pointer_array would be initialized with zero, which was
not the case. This cause the test of
if (!select->ref_pointer_array[counter]) in item.cc to be unpredictable
and causes crashes.

Fixed by zero-filling ref_pointer_array on allocation.
2024-10-16 17:24:46 +03:00
Monty
0de2613e7a Fixed that SHOW CREATE TABLE for sequences shows used table options 2024-10-16 17:24:46 +03:00
Monty
2c52fdd28a MDEV-32350 Can't selectively restore sequences using innodb tables from backup
Added support for sequences to do  discard and import tablespace
2024-10-16 17:24:46 +03:00
Monty
ee908140ac Fixed bug in main.connect test where Connection_errors showed wrong value 2024-10-16 17:24:46 +03:00
Monty
a8010e7689 Fixes buildbot issue with plugin.fulltext_plugin
The test is using features not in the embedded server.
Fixed by including not_embedded.inc
2024-10-16 17:24:46 +03:00
Vladislav Vaintroub
c1fc59277a MDEV-34929 page-compressed tables do not work on Windows
Remove workaround for MDEV-13941, it served for 5 years,and all affected
pre-release 10.2 installation should have been already fixed in between.

Apparently Innodb is using is_sparse parameter in os_file_set_size()
inconsistently, and it passes is_sparse=false now during first file
extension. With MDEV-13941 workaround in place, it would unsparse
the file, which is makes compression not to work at all anymore.
2024-10-16 16:02:13 +02:00
Sergei Petrunia
89493a9980 MDEV-34993: fix merge into 10.6: OPTIMIZER_ADJ_FIX_CARD_MULT should be ON by default 2024-10-16 16:46:38 +03:00
Sergei Golubchik
5ebda30ccc Revert "MDEV-35019 Provide a way to enable "rollback XA on disconnect" behavior we had before 10.5.2"
This reverts commit 8ae462a220.
2024-10-16 13:23:47 +02:00
Kristian Nielsen
8ae462a220 MDEV-35019 Provide a way to enable "rollback XA on disconnect" behavior we had before 10.5.2
Implement variable legacy_xa_rollback_at_disconnect to support
backwards compatibility for applications that rely on the pre-10.5
behavior for connection disconnect, which is to rollback the
transaction (in violation of the XA specification).

Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-10-16 10:18:36 +02:00
Sergei Petrunia
9849e3f948 MDEV-35072: Assertion with optimizer_join_limit_pref_ratio and 1-table select
Pre-11.0 variant:

1. In recompute_join_cost_with_limit(), add an assertion that that
partial_join_cost >= 0.0.

2. best_extension_by_limited_search() subtracts COST_EPS from
   join->best_read. But it is not subtracted from
   join->positions[0].read_time, add it back.

2. We could get very small negative partial_join_cost due to rounding
  errors. For fraction=1.0, we were computing essentially this (denote
  as EXPR-1):

  $row_read_cost + $where_cost - ($row_read_cost + $where_cost)

which should compute to 0.
But the computation was done in the following order (left-to-right):
EXPR-2:

  ($row_read_cost + $where_cost) - $row_read_cost - $where_cost

this produced a value of -1.1102230246251565e-16 due to a rounding
error. Change the computation use EXPR-1 instead of EXPR-2.
2024-10-15 15:56:41 +03:00
Sergei Petrunia
66b8d32b75 MDEV-35072: Assertion with optimizer_join_limit_pref_ratio and 1-table select
Variant for 11.2+:

In recompute_join_cost_with_limit(), do not subtract the cost of checking
the WHERE:
   pos->records_read* WHERE_COST_THD(join->thd)

It is already included in pos->read_time.

Also added comments about difference between this fix and the pre-11.2
variant.
2024-10-15 15:01:29 +03:00
Oleksandr Byelkin
7038382e1b Fix grant5.test with view protocol 2024-10-15 12:44:24 +02:00
Oleksandr Byelkin
44804c667e MDEV-18151 pam test result fix
Fix result due to the new error message.
2024-10-15 12:44:24 +02:00
Oleksandr Byelkin
f27b69a447 fix MDEV-26459 test on 32bit systems.
(where it test nothing but produce warnngs)
2024-10-15 09:25:05 +02:00
Thirunarayanan Balathandayuthapani
6aaae4c03b MDEV-35122 Incorrect NULL value handling for instantly dropped BLOB columns
Problem:
=======
- Redundant table fails to insert into the table after
instant drop blob column. Instant drop column only marking
the column as hidden and consecutive insert statement tries
to insert NULL value for the dropped BLOB column and returns
the fixed length of the blob type as 65535. This lead to
row size too large error.

Fix:
====
 For redundant table, if the non-fixed dropped column can be null
then set the length of the field type as 0.
2024-10-15 12:04:37 +05:30
Yuchen Pei
cd5577ba4a
Merge branch '10.5' into 10.6 2024-10-15 16:00:44 +11:00
Yuchen Pei
35cebfdc51
MDEV-15696 Implement SHOW CREATE SERVER
One change is that if the port is not supplied or out of bound, the
old behaviour is to print 3306. The new behaviour is to not print
it (if not supplied) or the out of bound value.
2024-10-15 10:50:23 +11:00
Yuchen Pei
d2eba35653
MDEV-34716 Allow arbitrary options in CREATE SERVER
The existing syntax for CREATE SERVER

CREATE [OR REPLACE] SERVER [IF NOT EXISTS] server_name
    FOREIGN DATA WRAPPER wrapper_name
    OPTIONS (option [, option] ...)

option:
  { HOST character-literal
  | DATABASE character-literal
  | USER character-literal
  | PASSWORD character-literal
  | SOCKET character-literal
  | OWNER character-literal
  | PORT numeric-literal }

With this change we have:

option:
  { HOST character-literal
  | DATABASE character-literal
  | USER character-literal
  | PASSWORD character-literal
  | SOCKET character-literal
  | OWNER character-literal
  | PORT numeric-literal
  | PORT quoted-numerical-literal
  | identifier character-literal}

We store these options as a JSON field in the mysql.servers system
table. We retain the restriction that PORT needs to be a number, but
also allow it to be a quoted number, so that SHOW CREATE SERVER can be
used for dumping. Without an accompanied implementation of SHOW CREATE
SERVER, some mysqldump tests will fail. Therefore this commit should
be immediately followed by the one implementating SHOW CREATE SERVER,
with testing covering both.
2024-10-15 10:50:22 +11:00
Yuchen Pei
2345407b8c
MDEV-34716 Fix mysql.servers socket max length too short
The limit of socket length on unix according to libc is 108, see
sockaddr_un::sun_path, but in the table it is a string of max length
64, which results in truncation of socket and failure to connect by
plugins using servers such as spider.
2024-10-15 10:50:22 +11:00
Rex
9315452ea0 MDEV-34941 MDEV-31466-fix column count issue with union in derived table
In specifying a derived table with a union, for example

CREATE TABLE t (c1 INT KEY,c2 INT,c3 INT) ENGINE=MyISAM;
SELECT * FROM (SELECT * FROM t UNION SELECT * FROM t) AS d (d1,d2);

we bypass an earlier check for the correct number of specified column
names, causing a crash.

Fixed by adding a check for the correct number of supplied arguments
in st_select_lex_unit::rename_types_list()
2024-10-15 06:50:19 +12:00
Rex
e90aab7acc MDEV-34931 MDEV-31466 name resolution fails in --view
Fix for MDEV-31466 - add optional derived table column names.
Column names within a SELECT_LEX structure can be left in a non-reparsable
state (as printed out from *::print) after JOIN::prepare.  This caused
an incorrect view definition to be written into the .FRM file.
Fixed by resetting item list names in SELECT_LEX structures representing
derived tables before writing out the view definition.

Reviewed by Igor Babaev (igor@mariadb.com)
2024-10-15 06:08:46 +12:00
Rex
10008b3d3e MDEV-31466 Add optional correlation column list for derived tables
Extend derived table syntax to support column name assignment.
(subquery expression) [as|=] ident [comma separated column name list].
Prior to this patch, the optional comma separated column name list is
not supported.

Processing within the unit of the subquery expression will use
original column names, outside the unit will use the new names.

For example, in the query

select a1, a2 from
  (select c1, c2, c3 from t1 where c2 > 0) as dt (a1, a2, a3)
where a2 > 10;

we see the second column of the derived table dt being used both within,
(where c2 > 0), and outside, (where a2 > 10), the specification.
Both conditions apply to t1.c2.

When multiple unit preparations are required, such as when being used within
a prepared statement or procedure, original column names are needed for
correct resolution. Original names are reset within mysql_derived_reinit().

Item_holder items, used for result tables in both TVC and union preparations
are renamed before use within st_select_lex_unit::prepare().

During wildcard expansion, if column names are present, items names are
set directly after creation.

Reviewed by Igor Babaev (igor@mariadb.com)
2024-10-15 06:08:46 +12:00
Thirunarayanan Balathandayuthapani
5777d9f282 MDEV-35116 InnoDB fails to set error index for HA_ERR_NULL_IN_SPATIAL
- InnoDB fails to set the index information or index number
for the spatial index error HA_ERR_NULL_IN_SPATIAL.

row_build_spatial_index_key(): Initialize the tmp_mbr array completely.

check_if_supported_inplace_alter(): Fix the spelling mistake of alter
2024-10-14 14:28:24 +05:30
Oleksandr Byelkin
a79c9b3812 MDEV-35135 Assertion `!is_cond()' failed in Item_bool_func::val_int / do_select
Change val_int with val_bool when it is a condition.
2024-10-14 09:36:17 +02:00
Marko Mäkelä
4e1e9ea6f3 MDEV-35124 Set innodb_snapshot_isolation=ON by default
From the very beginning, the default InnoDB transaction isolation level
REPEATABLE READ does not correspond to any well formed definition.
The main issue is the lack of write/write conflict detection.
To fix that and to make REPEATABLE READ correspond to Snapshot Isolation,
b8a6719889 introduced the Boolean
session variable innodb_snapshot_isolation. It was disabled by default
in order not to break any user applications.

In a new major version of MariaDB Server, we had better enable this
parameter by default.
2024-10-11 15:02:31 +03:00
Monty
69686375a8 MDEV-34782 SIGSEGV in handler::update_global_table_stats in close_thread_table()
Handler statistics did not take into account that it could not be fully
initialized in the table.
2024-10-09 17:22:48 +03:00
Oleksandr Byelkin
b138f428ea MDEV-18151 Skipped error returning for GRANT/SET PASSWORD
Make message of error not warning.
2024-10-09 15:48:14 +02:00
Oleksandr Byelkin
cc59fbfffa MDEV-18151 Skipped error returning for GRANT/SET PASSWORD
Make error issueing for GRANT and SET PASSWORD the same.
Report errors wich were skipped before.
2024-10-09 13:29:59 +02:00
Oleksandr Byelkin
9cd9c3cf16 fix grant5 test to return to the original database. 2024-10-09 13:29:46 +02:00
Marko Mäkelä
8981ee238a MDEV-33106 innodb.innodb-lock-inherit-read_commited times out
Sometimes, in MariaDB Server 10.5 but apparently not in later branches,
the test would hang because con1 and con2 would be blocked in
debug_sync (for example, lock_wait_suspend_thread_enter and
row_ins_sec_index_entry_dup_locks_created) and therefore blocking
the purge of transactions from completing.

To prevent an occasional DEBUG_SYNC induced hang in the test, we will
wait for everything to be purged, except the last 2 transactions.

This change should be null-merged to 10.6, because the test is not
failing in 10.6 or later major versions.
2024-10-09 11:26:12 +03:00
Oleksandr Byelkin
1d0e94c55f Merge branch '10.5' into 10.6 2024-10-09 08:38:48 +02:00
Sergei Golubchik
4281e0068b MDEV-35082 HANDLER with FULLTEXT keys is not always rejected
need to check for index capabilities also for HANDLER READ NEXT
2024-10-08 18:20:13 +02:00
Sergei Golubchik
23d7dc1ea1 MDEV-35086 Trying to lock mutex when the mutex was already locked (session_tracker.cc), server hangs
followup for 9021f40b8e
2024-10-08 15:31:02 +02:00
Sergei Golubchik
3ea71a2c8e MDEV-16699 heap-use-after-free in group_concat with compressed or GIS columns
Field_blob::store() has special code for GROUP_CONCAT temporary table
(to store blob values in Blob_mem_storage - this prevents them
from being freed/overwritten when a next row is read).

Field_geom and Field_blob_compressed inherit from Field_blob but they
have their own ::store() method without this special Blob_mem_storage
support.

Considering that non-grouping CONCAT() of such fields converts
them to plain BLOB, let's do the same for GROUP_CONCAT. To do it,
Item_func_group_concat::setup will signal that it's creating
a temporary table for GROUP_CONCAT, and Field_blog::make_new_field()
override will create base Field_blob when under group concat.
2024-10-08 15:31:02 +02:00
Aleksey Midenkov
706a8bcb5b MDEV-33470 Unique hash index is broken on DML for system-versioned table
Hash index is vcol-based wrapper (MDEV-371). row_end is added to
unique index. So when row_end is updated unique hash index must be
recalculated via vcol_update_fields(). DELETE did not update virtual
fields, so DELETE HISTORY was getting wrong hash value.

The fix does update_virtual_fields() on vers_update_end() so in every
case row_end is updated virtual fields are updated as well.
2024-10-08 13:08:10 +03:00
Aleksey Midenkov
d37bb140b1 MDEV-31297 Create table as select on system versioned tables do not
work consistently on replication

Row-based replication does not execute CREATE .. SELECT but instead
CREATE TABLE. CREATE .. SELECT creates implict system fields on
unusual place: in-between declared fields and select fields. That was
done because select_field_pos logic requires select fields go last in
create_list.

So, CREATE .. SELECT on master and CREATE TABLE on slave create system
fields on different positions and replication gets field mismatch.

To fix this we've changed CREATE .. SELECT to create implicit system
fields on usual place in the end and updated select_field_pos for
handling this case.
2024-10-08 13:08:10 +03:00
Alexander Barkov
a931da82fa MDEV-34123 CONCAT Function Returns Unexpected Empty Set in Query
Search conditions were evaluated using val_int(), which was wrong.
Fixing the code to use val_bool() instead.

Details:
- Adding a new item_base_t::IS_COND flag which marks Items used
  as <search condition> in WHERE, HAVING, JOIN ON, CASE WHEN clauses.
  The flag is at the parse time.
  These expressions must be evaluated using val_bool() rather than val_int().

  Note, the optimizer creates more Items which are used as search conditions.
  Most of these items are not marked with IS_COND yet. This is OK for now,
  but eventually these Items can also be fixed to have the flag.

- Adding a method Item::is_cond() which tests if the Item has the IS_COND flag.

- Implementing Item_cache_bool. It evaluates the cached expression using
  val_bool() rather than val_int().
  Overriding Type_handler_bool::Item_get_cache() to create Item_cache_bool.

- Implementing Item::save_bool_in_field(). It uses val_bool() rather than
  val_int() to evaluate the expression.

- Implementing Type_handler_bool::Item_save_in_field()
  using Item::save_bool_in_field().

- Fixing all Item_bool_func descendants to implement a virtual val_bool()
  rather than a virtual val_int().

- To find places where val_int() should be fixed to val_bool(), a few
  DBUG_ASSERT(!is_cond()) where added into val_int() implementations
  of selected (most frequent) classes:

  Item_field
  Item_str_func
  Item_datefunc
  Item_timefunc
  Item_datetimefunc
  Item_cache_bool
  Item_bool_func
  Item_func_hybrid_field_type
  Item_basic_constant descendants

- Fixing all places where DBUG_ASSERT() happened during an "mtr" run
  to use val_bool() instead of val_int().
2024-10-08 11:58:46 +02:00
Christian Gonzalez
fd0cc2b1fd Make SESSION_USER() comparable with CURRENT_USER()
Update `SESSION_USER()` behaviour to be comparable with `CURRENT_USER()`.
`SESSION_USER()` will return the user and host columns from `mysql.user`
used to authenticate the user when the session was created.

Historically `SESSION_USER()` was an alias of `USER()` function. The
main difference with `USER()` behaviour after this changes is that
`SESSION_USER()` now returns the host column from `mysql.user` instead of
the client host or ip.

NOTE: `SESSION_USER_IS_USER` old mode is added to make the change
backward compatible.

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer
Amazon Web Services, Inc.
2024-10-04 13:22:40 +02:00
Sergei Petrunia
5673cbe094 MDEV-35074: selectivity_notembedded fails with --view-protocol
Make the test view-protocol proof: save the contents of optimizer_trace
and then we can do many queries against it.

Also removed end-of-line spaces.
2024-10-04 12:49:37 +03:00
Ocean Li
eedbb901e5 [MDEV-14978] Client programs to use $MARIADB_HOST consistently
Only `mysql` client program was using $MYSQL_HOST as the default host.
Add the same feature in most other client programs but using
$MARIADB_HOST instead.

All new code of the whole pull request, including one or several files that are
either new files or modified ones, are contributed under the BSD-new license. I
am contributing on behalf of my employer Amazon Web Services, Inc.
2024-10-04 06:44:39 +01:00
Yuchen Pei
e9c999caf4
MDEV-34915 Sort output of session track system variable in mysqltest
Updated the output format too.

Before:

-- Tracker : SESSION_TRACK_SYSTEM_VARIABLES
-- autocommit
-- ON
-- time_zone
-- SYSTEM
-- character_set_client
-- latin1
-- character_set_connection
-- latin1
-- redirect_url
--

After:

-- Tracker : SESSION_TRACK_SYSTEM_VARIABLES
-- autocommit: ON
-- character_set_client: latin1
-- character_set_connection: latin1
-- redirect_url:
-- time_zone: SYSTEM
2024-10-04 10:32:02 +10:00
sts-kokseng.wong
383d1f90dd The revision corresponds to the review comments. 1. Move the unit tests into the compat/oracle suite, sp-param.test file. 2. Remove the added unit test file and result file. 3. Add type, Alter_info::enum_alter_table_algorithm, into the union. 4. Remove the extra switch case 2024-10-04 00:17:37 +02:00
sts-kokseng.wong
43825af101 MDEV-34316 sql_mode=ORACLE: Ignore the NOCOPY keyword in stored routine parameters
During sql_mode=ORACLE, ignore the NOCOPY keyword in stored routine
parameters. The optimization (pass-by-reference instead of
pass-by-value) helping to avoid value copying will be done in a separate
task when needed.
2024-10-04 00:17:37 +02:00
Marko Mäkelä
f493e46494 Merge 11.6 into 11.7 2024-10-03 18:15:13 +03:00
Marko Mäkelä
590718cadc MDEV-31221 after-merge fix for test
The 10.5 test case was accidentally added after 10.11 ones.
Fixes up 63913ce5af
2024-10-03 17:01:29 +03:00
Marko Mäkelä
43465352b9 Merge 11.4 into 11.6 2024-10-03 16:09:56 +03:00
Marko Mäkelä
b53b81e937 Merge 11.2 into 11.4 2024-10-03 14:32:14 +03:00
Monty
6f6c1911dc MDEV-34251 Conditional jump or move depends on uninitialised value in ha_handler_stats::has_stats
Fixed by checking handler_stats if it's active instead of
thd->variables.log_slow_verbosity & LOG_SLOW_VERBOSITY_ENGINE.

Reviewed-by: Sergei Petrunia <sergey@mariadb.com>
2024-10-03 13:45:26 +03:00
Marko Mäkelä
12a91b57e2 Merge 10.11 into 11.2 2024-10-03 13:24:43 +03:00
Sergei Golubchik
ab15628bbc MDEV-35050 fix for embedded 2024-10-03 10:09:24 +02:00
Marko Mäkelä
63913ce5af Merge 10.6 into 10.11 2024-10-03 10:55:08 +03:00
Marko Mäkelä
c6e4ea682c Merge 10.5 into 10.6 2024-10-03 10:42:58 +03:00
Marko Mäkelä
6878c9d591 MDEV-35050 fixup: ./mtr --embedded 2024-10-03 10:40:58 +03:00
Marko Mäkelä
7e0afb1c73 Merge 10.5 into 10.6 2024-10-03 09:31:39 +03:00
Yuchen Pei
ba7088d462
Merge '11.4' into 11.6 2024-10-03 15:59:20 +10:00
Marko Mäkelä
cc70ca7eab MDEV-35059 ALTER TABLE...IMPORT TABLESPACE with FULLTEXT SEARCH may corrupt the adaptive hash index
build_fts_hidden_table(): Correct a mistake that had been made in
commit 903ae30069 (MDEV-30655).
2024-10-02 11:09:31 +03:00
Kristian Nielsen
90f090f22c Fix binlog.binlog_mdev25611 test failure on non-debug build
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-10-02 09:59:53 +02:00
Sergei Petrunia
1cda4726ca MDEV-34993, part2: backport optimizer_adjust_secondary_key_costs
...and make the fix for MDEV-34993 switchable. It is enabled by default
and controlled with @optimizer_adjust_secondary_key_costs=fix_card_multiplier
2024-10-02 10:52:09 +03:00
Sergei Petrunia
8166a5d33d MDEV-34993: Incorrect cardinality estimation causes poor query plan
When calculate_cond_selectivity_for_table() takes into account multi-
column selectivities from range access, it tries to take-into account
that selectivity for some columns may have been already taken into account.

For example, for range access on IDX1 using {kp1, kp2}, the selectivity
of restrictions on "kp2" might have already been taken into account
to some extent.
So, the code tries to "discount" that using rec_per_key[] estimates.

This seems to be wrong and unreliable: the "discounting" may produce a
rselectivity_multiplier number that hints that the overall selectivity
of range access on IDX1 was greater than 1.

Do a conservative fix: if we arrive at conclusion that selectivity of
range access on condition in IDX1 >1.0, clip it down to 1.
2024-10-02 10:52:09 +03:00
Sergei Golubchik
9021f40b8e MDEV-35050 Found wrong usage of mutex upon setting plugin session variables 2024-10-01 18:29:11 +02:00
Sergei Golubchik
5bf543fd43 MDEV-24193 UBSAN: sql/sql_acl.cc:9985:29: runtime error: member access within null pointer of type 'struct TABLE' , ASAN: use-after-poison in handle_grant_table
privilege tables do not always have to exist
2024-10-01 18:29:11 +02:00
Marko Mäkelä
464055fe65 MDEV-34078 Memory leak in InnoDB purge with 32-column PRIMARY KEY
row_purge_reset_trx_id(): Reserve large enough offsets for accomodating
the maximum width PRIMARY KEY followed by DB_TRX_ID,DB_ROLL_PTR.

Reviewed by: Thirunarayanan Balathandayuthapani
2024-10-01 18:35:39 +03:00
Oleksandr Byelkin
8d810e9426 MDEV-29537 Creation of view with UNION and SELECT ... FOR UPDATE in definition is failed with error
lock_type is writen in the last SELECT of the unit even if it parsed last,
so it should be printed last from the last select of the unit.
2024-10-01 11:07:45 +02:00
Rucha Deodhar
753e7d6d7c MDEV-27412: JSON_TABLE doesn't properly unquote strings
Analysis:
The value gets appended as string instead of unescaped json value

Fix:
Append the value of json in a temporary string and then store it in the
field instead of directly storing as string.
2024-10-01 13:45:46 +05:30
Thirunarayanan Balathandayuthapani
cc810e64d4 MDEV-34392 Inplace algorithm violates the foreign key constraint
Don't allow the referencing key column from NULL TO NOT NULL
when

 1) Foreign key constraint type is ON UPDATE SET NULL
 2) Foreign key constraint type is ON DELETE SET NULL
 3) Foreign key constraint type is UPDATE CASCADE and referenced
 column declared as NULL

Don't allow the referenced key column from NOT NULL to NULL
when foreign key constraint type is UPDATE CASCADE
and referencing key columns doesn't allow NULL values

get_foreign_key_info(): InnoDB sends the information about
nullability of the foreign key fields and referenced key fields.

fk_check_column_changes(): Enforce the above rules for COPY
algorithm

innobase_check_foreign_drop_col(): Checks whether the dropped
column exists in existing foreign key relation

innobase_check_foreign_low() : Enforce the above rules for
INPLACE algorithm

dict_foreign_t::check_fk_constraint_valid(): This is used
by CREATE TABLE statement to check nullability for foreign
key relation.
2024-10-01 09:41:56 +05:30
Oleksandr Byelkin
da322f19c6 MDEV-26459 Assertion `block_size <= 0xFFFFFFFFL' failed in calculate_block_sizes for 10.7 only
Limit default allocation block in tree of Unique class
2024-09-30 15:18:00 +02:00
Marko Mäkelä
d28ac3f82d MDEV-34207: ALTER TABLE...STATS_PERSISTENT=0 fails to drop statistics
commit_try_norebuild(): If the STATS_PERSISTENT attribute of the table
is being changed to disabled, drop the persistent statistics of the table.
2024-09-30 15:27:38 +03:00
Sergei Golubchik
b88f1267e4 MDEV-33373 part 2: Unexpected ER_FILE_NOT_FOUND upon reading from logging table after crash recovery
CSV engine shoud set my_errno if use it.
2024-09-30 13:50:51 +02:00
Julius Goryavsky
95d285fb75 MDEV-30307 addendum: support for compilation in release mode 2024-09-30 00:33:23 +02:00
sjaakola
cf0c3ec274 MDEV-30307 KILL command inside a transaction causes problem for galera replication
Added new test scenario in galera.galera_bf_kill
test to make the issue surface. The tetst scenario has
a multi statement transaction containing a KILL command.
When the KILL is submitted, another transaction is
replicated, which causes BF abort for the KILL command
processing. Handling BF abort rollback while executing
KILL command causes node hanging, in this scenario.

sql_kill() and sql_kill_user() functions have now fix,
to perform implicit commit before starting the KILL command
execution. BEcause of the implicit commit, the KILL execution
will not happen inside transaction context anymore.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-27 19:26:26 +02:00
Kristian Nielsen
35c732cdde MDEV-25611: RESET MASTER causes the server to hang
RESET MASTER waits for storage engines to reply to a binlog checkpoint
requests. If this response is delayed for a long time for some reason, then
RESET MASTER can hang.

Fix this by forcing a log sync in all engines just before waiting for the
checkpoint reply.

(Waiting for old checkpoint responses is needed to preserve durability of
any commits that were synced to disk in the to-be-deleted binlog but not yet
synced in the engine.)

Reviewed-by: Andrei Elkin <andrei.elkin@mariadb.com>
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>
2024-09-27 15:10:06 +02:00
Sergei Petrunia
d0a6a7886b MDEV-25822 JSON_TABLE: default values should allow non-string literals
(Polished initial patch by Alexey Botchkov)
Make the code handle DEFAULT values of any datatype

- Make Json_table_column::On_response::m_default be Item*, not LEX_STRING.
- Change the parser to use string literal non-terminals for producing
  the DEFAULT value
-- Also, stop updating json_table->m_text_literal_cs for the DEFAULT
   value literals as it is not used.
2024-09-27 13:52:17 +03:00
Marko Mäkelä
6acada713a MDEV-34062: Implement innodb_log_file_mmap on 64-bit systems
When using the default innodb_log_buffer_size=2m, mariadb-backup --backup
would spend a lot of time re-reading and re-parsing the log. For reads,
it would be beneficial to memory-map the entire ib_logfile0 to the
address space (typically 48 bits or 256 TiB) and read it from there,
both during --backup and --prepare.

We will introduce the Boolean read-only parameter innodb_log_file_mmap
that will be OFF by default on most platforms, to avoid aggressive
read-ahead of the entire ib_logfile0 in when only a tiny portion would be
accessed. On Linux and FreeBSD the default is innodb_log_file_mmap=ON,
because those platforms define a specific mmap(2) option for enabling
such read-ahead and therefore it can be assumed that the default would
be on-demand paging. This parameter will only have impact on the initial
InnoDB startup and recovery. Any writes to the log will use regular I/O,
except when the ib_logfile0 is stored in a specially configured file system
that is backed by persistent memory (Linux "mount -o dax").

We also experimented with allowing writes of the ib_logfile0 via a
memory mapping and decided against it. A fundamental problem would be
unnecessary read-before-write in case of a major page fault, that is,
when a new, not yet cached, virtual memory page in the circular
ib_logfile0 is being written to. There appears to be no way to tell
the operating system that we do not care about the previous contents of
the page, or that the page fault handler should just zero it out.

Many references to HAVE_PMEM have been replaced with references to
HAVE_INNODB_MMAP.

The predicate log_sys.is_pmem() has been replaced with
log_sys.is_mmap() && !log_sys.is_opened().

Memory-mapped regular files differ from MAP_SYNC (PMEM) mappings in the
way that an open file handle to ib_logfile0 will be retained. In both
code paths, log_sys.is_mmap() will hold. Holding a file handle open will
allow log_t::clear_mmap() to disable the interface with fewer operations.

It should be noted that ever since
commit 685d958e38 (MDEV-14425)
most 64-bit Linux platforms on our CI platforms
(s390x a.k.a. IBM System Z being a notable exception) read and write
/dev/shm/*/ib_logfile0 via a memory mapping, pretending that it is
persistent memory (mount -o dax). So, the memory mapping based log
parsing that this change is enabling by default on Linux and FreeBSD
has already been extensively tested on Linux.

::log_mmap(): If a log cannot be opened as PMEM and the desired access
is read-only, try to open a read-only memory mapping.

xtrabackup_copy_mmap_snippet(), xtrabackup_copy_mmap_logfile():
Copy the InnoDB log in mariadb-backup --backup from a memory
mapped file.
2024-09-26 18:47:12 +03:00
Tony Chen
be164fc401 ssl_cipher parameter cannot configure TLSv1.3 and TLSv1.2 ciphers at the same time
SSL_CTX_set_ciphersuites() sets the TLSv1.3 cipher suites.

SSL_CTX_set_cipher_list() sets the ciphers for TLSv1.2 and below.

The current TLS configuration logic will not perform SSL_CTX_set_cipher_list()
to configure TLSv1.2 ciphers if the call to SSL_CTX_set_ciphersuites() was
successful. The call to SSL_CTX_set_ciphersuites() is successful if any TLSv1.3
cipher suite is passed into `--ssl-cipher`.

This is a potential security vulnerability because users trying to restrict
specific secure ciphers for TLSv1.3 and TLSv1.2, would unknowingly still have
the database support insecure TLSv1.2 ciphers.

For example:
If setting `--ssl_cipher=TLS_AES_128_GCM_SHA256:ECDHE-RSA-AES128-GCM-SHA256`,
the database would still support all possible TLSv1.2 ciphers rather than only
ECDHE-RSA-AES128-GCM-SHA256.

The solution is to execute both SSL_CTX_set_ciphersuites() and
SSL_CTX_set_cipher_list() even if the first call succeeds.

This allows the configuration of exactly which TLSv1.3 and TLSv1.2 ciphers to
support.

Note that there is 1 behavior change with this. When specifying only TLSv1.3
ciphers to `--ssl-cipher`, the database will not support any TLSv1.2 cipher.
However, this does not impose a security risk and considering TLSv1.3 is the
modern protocol, this behavior should be fine.

All TLSv1.3 ciphers are still supported if only TLSv1.2 ciphers are specified
through `--ssl-cipher`.

All new code of the whole pull request, including one or several files that are
either new files or modified ones, are contributed under the BSD-new license. I
am contributing on behalf of my employer Amazon Web Services, Inc.
2024-09-26 11:50:20 +02:00
Jan Lindström
024e95128b MDEV-32996 : galera.galera_var_ignore_apply_errors -> [ERROR] WSREP: Inconsistency detected
Add wait_until_ready waits after wsrep_on is set on again to
make sure that node is ready for next step before continuing.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-26 00:04:56 +02:00
Jan Lindström
0ce5603b86 MDEV-33035 : Galera test case MDEV-16509 unstable
Stabilize test by reseting DEBUG_SYNC and add wait_condition
for expected table contents.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-26 00:04:56 +02:00
Dave Gosselin
f7c5182b7c MDEV-31636 Memory leak in Sys_var_gtid_binlog_state::do_check()
Move memory allocations performed during Sys_var_gtid_binlog_state::do_check
to Sys_var_gtid_binlog_state::global_update where they will be freed before
the latter method returns.
2024-09-25 14:17:21 -04:00
Lena Startseva
ad5b9c207c MDEV-27944: View-protocol fails if database was changed
This is a limitation of the view protocol.
Tests were fixed with workaround (via disable/enable service connection)
2024-09-25 11:44:28 +07:00
Sergei Golubchik
dd1cad7e5f galera_3nodes.MDEV-29171 fails
set transferfmt in .cnf file like other galera tests do.
otherwise it defaults to socat when mtr detected that only nc is available
2024-09-24 14:30:24 +02:00
Denis Protivensky
231900e5bb MDEV-34836: TOI on parent table must BF abort SR in progress on a child
Applied SR transaction on the child table was not BF aborted by TOI running
on the parent table for several reasons:

Although SR correctly collected FK-referenced keys to parent, TOI in Galera
disregards common certification index and simply sets itself to depend on
the latest certified write set seqno.

Since this write set was the fragment of SR transaction, TOI was allowed to
run in parallel with SR presuming it would BF abort the latter.

At the same time, DML transactions in the server don't grab MDL locks on
FK-referenced tables, thus parent table wasn't protected by an MDL lock from
SR and it couldn't provoke MDL lock conflict for TOI to BF abort SR transaction.

In InnoDB, DDL transactions grab shared MDL locks on child tables, which is not
enough to trigger MDL conflict in Galera.

InnoDB-level Wsrep patch didn't contain correct conflict resolution logic due to
the fact that it was believed MDL locking should always produce conflicts correctly.

The fix brings conflict resolution rules similar to MDL-level checks to InnoDB,
thus accounting for the problematic case.

Apart from that, wsrep_thd_is_SR() is patched to return true only for executing
SR transactions. It should be safe as any other SR state is either the same as
for any single write set (thus making the two logically equivalent), or it reflects
an SR transaction as being aborting or prepared, which is handled separately in
BF-aborting logic, and for regular execution path it should not matter at all.

Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
2024-09-24 11:14:01 +02:00
Marko Mäkelä
971cf59579 Merge 10.6 into 10.11 2024-09-24 08:49:20 +03:00
Julius Goryavsky
f1b4d36cc3 Merge branch '10.11' into '11.2' 2024-09-24 05:23:37 +02:00
Oleksandr Byelkin
c9f54e20d4 MDEV-33990: SHOW STATUS counts ER_CON_COUNT_ERROR as Connection_errors_internal
Bring info about cause of closing connection in the place where we increment
statistics to do it correctly.
2024-09-23 16:16:51 +02:00
Sergei Golubchik
bbc62b1b9e clarify --thread-pool-mode usage 2024-09-23 12:51:21 +02:00
Sergei Golubchik
99837b6df6 restore --clent-rr after 7d86751de5 2024-09-23 12:51:21 +02:00
Lena Startseva
71649b93cf MDEV-31933: Make working view-protocol + ps-protocol (running two protocols together)
Fix for v. 10.6
2024-09-23 11:57:06 +07:00
mariadb-DebarunBanerjee
35d477dd1d MDEV-34453 Trying to read 16384 bytes at 70368744161280 outside the bounds of the file: ./ibdata1
The issue is caused by a race between buf_page_create_low getting the
page from buffer pool hash and buf_LRU_free_page evicting it from LRU.

The issue is introduced in 10.6 by MDEV-27058
commit aaef2e1d8c
MDEV-27058: Reduce the size of buf_block_t and buf_page_t

The solution is buffer fix the page before releasing buffer pool mutex
in buf_page_create_low when x_lock_try fails to acquire the page latch.
2024-09-20 20:26:43 +05:30
Alexander Barkov
681609d8a0 MDEV-32891 Assertion `value <= ((ulonglong) 0xFFFFFFFFL) * 10000ULL' failed in str_to_DDhhmmssff_internal
Fixing the wrong assert.
2024-09-20 15:39:09 +04:00
Alexander Barkov
607fc15393 MDEV-31302 Assertion `mon > 0 && mon < 13' failed in my_time_t sec_since_epoch(int, int, int, int, int, int)
The code erroneously called sec_since_epoch() for dates with zeros,
e.g. '2024-00-01'.
Fixi: adding a test that the date does not have zeros before
calling TIME_to_native().
2024-09-20 14:13:53 +04:00
Alexander Barkov
9ac8172ac3 MDEV-31221 UBSAN runtime error: negation of -9223372036854775808 cannot be represented in type 'long long int' in my_strtoll10_utf32
The code in my_strtoll10_mb2 and my_strtoll10_utf32
could hit undefinite behavior by negation of LONGLONG_MIN.
Fixing to avoid this.

Also, fixing my_strtoll10() in the same style.
The previous reduction produced a redundant warning on
CAST(_latin1'-9223372036854775808' AS SIGNED)
2024-09-20 13:04:57 +04:00
Alexander Barkov
841dc07ee1 MDEV-28386 UBSAN: runtime error: negation of -X cannot be represented in type 'long long int'; cast to an unsigned type to negate this value to itself in my_strntoull_8bit on SELECT ... OCT
The code in my_strntoull_8bit() and my_strntoull_mb2_or_mb4()
could hit undefinite behavior by negating of LONGLONG_MIN.
Fixing the code to avoid this.
2024-09-20 11:01:31 +04:00
Sergei Petrunia
8478a06cdb MDEV-34380: Set optimizer_switch='cset_narrowing=on' by default 2024-09-19 11:22:05 +03:00
Julius Goryavsky
cb83ae210c galera mtr suite: fixes for unstable tests 2024-09-19 09:02:46 +02:00
Lena Startseva
0a5e4a0191 MDEV-31005: Make working cursor-protocol
Updated tests: cases with bugs or which cannot be run
with the cursor-protocol were excluded with
"--disable_cursor_protocol"/"--enable_cursor_protocol"

Fix for v.10.5
2024-09-18 18:39:26 +07:00
Daniel Black
450040e0da MDEV-34952 main.log_slow test failure on opensuse builder
The loose regex for the MDEV-34539 test ended up
matching the opensuse in the path in buildbot.

Adjust to more complete regex including space,
backtick and \n, which becomes much less common
as a path name.
2024-09-18 18:46:27 +10:00
Daniel Black
391c9db486 MDEV-34952 main.log_slow test failure on opensuse builder
The loose regex for the MDEV-34539 test ended up
matching the opensuse in the path in buildbot.

Adjust to more complete regex including space,
backtick and \n, which becomes much less common
as a path name.
2024-09-18 17:00:27 +10:00
Marko Mäkelä
7ea9e1358f Merge 11.2 into 11.4 2024-09-18 08:07:22 +03:00
Marko Mäkelä
e782e416ac Merge 10.11 into 11.2 2024-09-18 07:38:49 +03:00
Yuchen Pei
1f7f406b7b
Merge branch '11.2' into 11.4 2024-09-18 11:27:53 +10:00
Yuchen Pei
ff88633b9c
Merge branch '10.11' into 11.2 2024-09-18 10:45:26 +10:00
Yuchen Pei
cfa9784edb
Merge branch '10.11' into 11.2 2024-09-18 10:25:16 +10:00
Brandon Nesterenko
68938d2b42 MDEV-33500 (part 2): rpl.rpl_parallel_sbm can still fail
The failing test case validates Seconds_Behind_Master for a delayed
slave, while STOP SLAVE is executed during a delay. The test fixes
initially added to the test (commit b04c857596) added a table lock
to ensure a transaction could not finish before validating the
Seconds_Behind_Master field after SLAVE START, but did not address a
possibility that the transaction could finish before running the
STOP SLAVE command, which invalidates the validations for the rest
of the test case. Specifically, this would result in 1) a timeout in
“Waiting for table metadata lock” on the replica, which expects the
transaction to retry after slave restart and hit a lock conflict on
the locked tables (added in b04c857596), and 2) that
Seconds_Behind_Master should have increased, but did not.

The failure can be reproduced by synchronizing the slave to the master
before the MDEV-32265 echo statement (i.e. before the SLAVE STOP).

This patch fixes the test by adding a mechanism to use DEBUG_SYNC to
synchronize a MASTER_DELAY, rather than continually increase the
duration of the delay each time the test fails on buildbot. This is
to ensure that on slow machines, a delay does not pass before the
test gets a chance to validate results. Additionally, it decreases
overall test time because the test can continue immediately after
validation, thereby bypassing the remainder of a full delay for each
transaction.
2024-09-17 06:29:20 -06:00
Sergei Petrunia
5cd3fa81ef Merge 10.11 -> 11.2 2024-09-17 12:34:33 +03:00
Alexander Barkov
a1adabdd5c MDEV-25900 Assertion octets < 1024' failed in Binlog_type_info_fixed_string::Binlog_type_info_fixed_string OR Assertion field_length < 1024' failed in Field_string::save_field_metadata
A CHAR column cannot be longer than 1024, because
Binlog_type_info_fixed_string::Binlog_type_info_fixed_string
replies on this fact - it cannot store binlog metadata for longer columns.

In case of the filename character set mbmaxlen is equal to 5,
so only 1024/5=204 characters can fit into the 1024 limit.
- In strict mode:
  Disallowing creation of a CHAR column with octet length grater than 1024.
- In non-strict mode:
  Automatically convert CHAR with octet length>1024 into VARCHAR.
2024-09-17 08:50:26 +04:00
Julius Goryavsky
f176248d4b Merge branch '10.6' into '10.11' 2024-09-17 06:23:10 +02:00
Julius Goryavsky
80fff4c6b1 Merge branch '10.5' into '10.6' 2024-09-16 16:39:59 +02:00
Marko Mäkelä
b187414764 Merge 10.6 into 10.11 2024-09-16 10:58:40 +03:00
Julius Goryavsky
7ee0e60bbb galera mtr tests: minor fixes to make tests more reliable 2024-09-15 05:05:03 +02:00
Marko Mäkelä
762ad01c7f Merge 11.2 into 11.4 2024-09-13 13:09:23 +03:00
Dave Gosselin
95885261f0 MDEV-27037 mysqlbinlog emits a warning when reaching EOF before stop-datetime
Emit a warning in the event that we finished processing input files
before reaching the boundary indicated by --stop-datetime.
2024-09-12 08:43:29 -04:00
Dave Gosselin
242b67f1de MDEV-27037 mysqlbinlog emits a warning when reaching EOF before stop-condition
Emit a warning in the event that we finished processing input files
before reaching the boundary indicated by --stop-position.
2024-09-12 08:43:29 -04:00