In previous versions it was stated that MDEV-25968 was causing other
jobs in the pipeline to fail if not run with "-j 2" but this bug was not
affecting fedora-ninja. This is still true for the public gitlab runners.
However, running the fedora-ninja job on custom runners with more processors
without the "-j 2" flag will cause the compiler to crash.
When running the build with 2,4,8,16,32 threads, build times were
consistent indicating that the typical bottleneck is I/O and not CPU
cores. Therefore, "-j 2" is not a big drawback and it was chosen in
order to remain consistent with the other builds affected by MDEV-25968.
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
Problem:
=======
ALTER TABLE in InnoDB fails to detect duplicate entries
for the unique index when the character set or collation of
an indexed column is changed in such a way that the character
encoding is compatible with the old table definition.
In this case, any secondary indexes on the changed columns
would be rebuilt (DROP INDEX, ADD INDEX).
Solution:
========
During ALTER TABLE, InnoDB keeps track of columns whose collation
changed, and will fill in the correct metadata when sorting the
index records, or applying changes from concurrent DML.
This metadata will be allocated in the dict_index_t::heap of
the being-created secondary indexes.
The fix was developed by Thirunarayanan Balathandayuthapani
and simplified by me.
Depending on OpenSSL version, and at least in 3.0.3, the client-side socket
timeout is reported as generic error (SSL_ERROR_SYSCALL), losing further
details (both errno and GetLastError() return 0). This results in client
reporting "Unknown OpenSSL error" 2026, instead of another generic
"Lost connection to server during query" 2013
Adjusted test case.
Part of MDEV-29000
OpenSSL 3.0.0+ does not support EVP_MD_CTX_FLAG_NON_FIPS_ALLOW any longer.
In OpenSSL 1.1.1 the non FIPS allowed flag is context specific, while
in 3.0.0+ it is a different EVP_MD provider.
Fixes#2010
part of MDEV-29000
Summary of changes
- MD_CTX_SIZE is increased
- EVP_CIPHER_CTX_buf_noconst(ctx) does not work anymore, points
to nobody knows where. The assumption made previously was that
(since the function does not seem to be documented)
was that it points to the last partial source block.
Add own partial block buffer for NOPAD encryption instead
- SECLEVEL in CipherString in openssl.cnf
had been downgraded to 0, from 1, to make TLSv1.0 and TLSv1.1 possible
(according to https://github.com/openssl/openssl/blob/openssl-3.0.0/NEWS.md
even though the manual for SSL_CTX_get_security_level claims that it
should not be necessary)
- Workaround Ssl_cipher_list issue, it now returns TLSv1.3 ciphers,
in addition to what was set in --ssl-cipher
- ctx_buf buffer now must be aligned to 16 bytes with openssl(
previously with WolfSSL only), ot crashes will happen
- updated aes-t , to be better debuggable
using function, rather than a huge multiline macro
added test that does "nopad" encryption piece-wise, to test
replacement of EVP_CIPHER_CTX_buf_noconst
part of MDEV-29000
ha_innobase::check_if_supported_inplace_alter(): Refuse to change the
collation of a column that would become or remain indexed as part of
the ALTER TABLE operation.
In MariaDB Server 10.6, we will allow this type of operation;
that fix depends on MDEV-15250.
Starting with 10.5, InnoDB crash recovery tests seem to time out
more easily under Valgrind, which emulates multiple threads by
interleaving them in a single operating system thread.
These tests will still be covered by
AddressSanitizer and MemorySanitizer.
- InnoDB mistakenly identifies the non-unique FTS_DOC_ID index as
FTS_DOC_ID_INDEX while loading the table. dict_load_indexes()
should check whether the index is unique before assigning
fts_doc_id_index
Before version 10, GCC would think that a right shift of an
unsigned char returns int. Let us explicitly cast that back,
to silence a bogus -Wconversion warning.
- In case of discarded tablespace, InnoDB can't read the root page to
assign the n_core_null_bytes. Consecutive instant DDL fails because
of non-matching n_core_null_bytes.
fil_space_t::acquire_low(): Introduce a parameter that specifies
which flags should be avoided. At all times, referenced() must not
be incremented if the STOPPING flag is set. When fil_system.mutex
is not being held by the current thread, the reference must not be
incremented if the CLOSING flag is set (unless NEEDS_FSYNC is set,
in fil_space_t::flush()).
fil_space_t::acquire(): Invoke acquire_low(STOPPING | CLOSING).
In this way, the reference count cannot be incremented after
fil_space_t::try_to_close() invoked fil_space_t::set_closing().
If the CLOSING flag was set, we must retry acquire_low() after
acquiring fil_system.mutex.
fil_space_t::prepare_acquired(): Replaces prepare(true).
fil_space_t::acquire_and_prepare(): Replaces prepare().
This basically retries fil_space_t::acquire() after
acquiring fil_system.mutex.
- Redundant InnoDB table fails to set the flags2 while loading
the table. It leads to "Upgrade index name failure" during alter
operation. InnoDB should set the flags2 to FTS_AUX_HEX_NAME when
fts is being loaded
recv_sys_t::recover_deferred(): Hold the exclusive page latch until
the tablespace has been set up. Otherwise, the write of the page
may be lost due to non-existent tablespace. This race only affects
the recovery of the first page in a newly created tablespace.
This race condition was introduced in MDEV-24626.
Log MDL state transitions. Trace-friendly message
format. DBUG_LOCK_FILE replaced by thread-local storage.
Logged states legend:
Seized lock was acquired without waiting
Waiting lock is waiting
Acquired lock was acquired after waiting
Released lock was released
Deadlock lock was aborted due to deadlock
Timeout lock was aborted due to timeout >0
Nowait lock was aborted due to zero timeout
Killed lock was aborted due to kill message
OOM can not acquire because out of memory
Usage:
mtr --mysqld=--debug=d,mdl,query:i:o,/tmp/mdl.log
Cleanup from garbage messages:
sed -i -re \
'/(mysql|performance_schema|sys|mtr)\// d; /MDL_BACKUP_/ d' \
/tmp/mdl.log
Ever since commit 9608773f75
the InnoDB persistent statistics are enabled on all InnoDB tables
by default. We must filter out any output that indicates that the
statistics tables are being internally accessed by InnoDB.
The issue was that flush_tables() didn't take a MDL lock on cached
TABLE_SHARE before calling open_table() to do a HA_EXTRA_FLUSH call.
Most engines seams to have no issue with it, but apparantly this conflicts
with InnoDB in 10.6 when using TRUNCATE
Fixed by taking a MDL lock before trying to open the table in
flush_tables().
There is no test case as it hard to repeat the scheduling that causes
the error. I did run the test case in MDEV-28897 to verify
that the bug is fixed.
The test was reported to fail sporadicaly with this diff:
--- mysql-test/main/information_schema_tables.result
+++ mysql-test/main/information_schema_tables.reject
@@ -21,6 +21,8 @@
disconnect con1;
connection default;
DROP VIEW IF EXISTS vv;
+Warnings:
+Note 4092 Unknown VIEW: 'test.vv'
in the "The originally reported non-deterministic test" part.
Disabling warnings around the DROP VIEW statement.
The Spider mixes the comma join with other join types, and thus
ERROR 1054 occurs. This is well-known issue caused by the higher
precedence of JOIN over the comma (,).
We can fix the problem simply by using JOINs instead of commas.
The bug is caused by a similar mechanism as MDEV-21027.
The function, check_insert_or_replace_autoincrement, failed to open
all the partitions on INSERT SELECT statements and it results in the
assertion error.
The heap-use-after-free is caused by the following mechanism:
* In the execution of FLUSH TABLE WITH READ LOCK, the function
spider_free_trx_conn() is called and the connections held by
SPIDER_TRX::trx_conn_hash are freed.
* Then, an instance of ha_spider maintains the freed connections
because they are also referenced from ha_spider::conns.
The ha_spider instance is kept in a lock structure until the
corresponding table is unlocked.
* Spider accesses ha_spider::conns on the implicit UNLOCK TABLE
issued by BEGIN.
In the first place, when the connections have been freed, it means
that there are really no remote table locked by Spider.
Thus, there is no need for Spider to access ha_spider::cons on the
implicit UNLOCK TABLE.
We can fix the bug by removing the above mentioned access to
ha_spider::conns. We also modified spider_free_trx_conn() so that it
frees the connections only when no table is locked to reduce the
chance of another heap-use-after-free on ha_spider::conns.
prepare_inplace_alter_table_dict(): If the table will not be rebuilt,
preserve all of the original ROW_FORMAT, including the compressed
page size flags related to ROW_FORMAT=COMPRESSED.
btr_root_raise_and_insert(), btr_lift_page_up(),
rtr_page_split_and_insert(): Reset DB_FAIL from a failure to
copy records on a ROW_FORMAT=COMPRESSED page to DB_SUCCESS
before retrying.
This fixes a regression that was introduced by
commit 0b47c126e3 (MDEV-13542).
btr_root_raise_and_insert(): Remove a redundant condition.
btr_page_split_and_insert() will invoke btr_page_split_and_insert()
if needed.
On FreeBSD, tests run on persistent storage, and no asynchronous I/O
has been implemented. Warnings about 205-second waits on dict_sys.latch
may occur.