Calling SetArrayOptions with Nodes[i - 1].Key as its nm argument
exposed a MemorySanitizer error when i=0 for the tests:
* connect.bson_udf
* connect.json_udf
* connect.json_udf_bin
Its assumed that a basic optimization would have eliminated
these invalid expressions.
As the nm argument was unused, it has been removed.
Clang ~16+ on MSAN became quite strict with uninitalized
data being passed and returned from functions. Non-debug builds
have a basic optimization that hides these from those builds
Two innodb cases violate the assumptions, however once inlined
with a basic optimization those that existed for uninitialized
values are removed.
(MDEV-36316) rec_set_bit_field_2 calling mach_read_from_2 hits a read of
bits it wasn't actually changing.
(MDEV-36327) The function dict_process_sys_columns_rec left
nth_v_col uninitialized unless it was a virtual column. This was
ok as the function i_s_sys_columns_fill_table also didn't read
this value unless it was a virtual column.
Problem:
=======
- In 10.11, During Copy algorithm, InnoDB does use bulk insert
for row by row insert operation. When temporary directory
ran out of memory, row_mysql_handle_errors() fails to handle
DB_TEMP_FILE_WRITE_FAIL.
- During inplace algorithm, concurrent DML fails to write
the log operation into the temporary file. InnoDB fail to
mark the error for the online log.
- ddl_log_write() releases the global ddl lock prematurely before
release the log memory entry
Fix:
===
row_mysql_handle_errors(): Rollback the transaction when
InnoDB encounters DB_TEMP_FILE_WRITE_FAIL
convert_error_code_to_mysql(): Report an aborted transaction
when InnoDB encounters DB_TEMP_FILE_WRITE_FAIL during
alter table algorithm=copy or innodb bulk insert operation
row_log_online_op(): Mark the error in online log when
InnoDB ran out of temporary space
fil_space_extend_must_retry(): Mark the os_has_said_disk_full
as true if os_file_set_size() fails
btr_cur_pessimistic_update(): Return error code when
btr_cur_pessimistic_insert() fails
ddl_log_write(): Release the global ddl lock after releasing
the log memory entry when error was encountered
btr_cur_optimistic_update(): Relax the assertion that
blob pointer can be null during rollback because InnoDB can
ran out of space while allocating the external page
ha_innobase::extra(): Rollback the transaction during DDL before
calling convert_error_code_to_mysql().
row_undo_mod_upd_exist_sec(): Remove the assertion which says
that InnoDB should fail to build index entry when rollbacking
an incomplete transaction after crash recovery. This scenario
can happen when InnoDB ran out of space.
row_upd_changes_ord_field_binary_func(): Relax the assertion to
make that externally stored field can be null when InnoDB ran out
of space.
Problem:
=======
- During inplace algorithm, concurrent DML fails to write
the log operation into the temporary file. InnoDB fail to
mark the error for the online log.
- ddl_log_write() releases the global ddl lock prematurely before
release the log memory entry
Fix:
===
row_log_online_op(): Mark the error in online log when
InnoDB ran out of temporary space
fil_space_extend_must_retry(): Mark the os_has_said_disk_full
as true if os_file_set_size() fails
btr_cur_pessimistic_update(): Return error code when
btr_cur_pessimistic_insert() fails
ddl_log_write(): Release the global ddl lock after releasing the
log memory entry when error was encountered
btr_cur_optimistic_update(): Relax the assertion that
blob pointer can be null during rollback because InnoDB can
ran out of space while allocating the external page
row_undo_mod_upd_exist_sec(): Remove the assertion which says
that InnoDB should fail to build index entry when rollbacking
an incomplete transaction after crash recovery. This scenario
can happen when InnoDB ran out of space.
row_upd_changes_ord_field_binary_func(): Relax the assertion to
make that externally stored field can be null when InnoDB ran out
of space.
When table2myisam() prepares recinfo structures BIT field was skipped
because pack_length_in_rec() returns 0. Instead of BIT field
DB_ROW_HASH_1 field was taken into recinfo structure and its length
was added to reclength. This is wrong because not stored fields must
not be prepared as record columns (MI_COLUMNDEF) in storage
layer. 0-length fields are prepared in "reserve space for null bits"
branch.
The problem only occurs with tables where there is no data for the
main record outside of the null bits.
The fix updates minpos condition so we avoid fields after
stored_rec_length and these are not stored by
definition. share->reclength already includes not stored lengths from
CREATE TABLE so we cannot use it as minpos starting point.
In Aria there is no "reserve space for null bits" and it does not
create column definition for BIT. Also there is no
setup_vcols_for_repair() to reproduce the issue. But nonetheless it
creates column definition for not stored fields redundantly, so we
patch table2maria() as well. The test case for Aria tries to
demonstrate BIT field works, it does not reproduce any issues (as
redundant column definition for not stored field seem to not cause any
problems).
ER_DUP_ENTRY on partitioned table
Now as c1492f3d07 (MDEV-36115) restores m_last_part table->file
points to partition p0 while the error happens in p1, so error index
does not match ib_table in innobase_get_mysql_key_number_for_index().
This case is handled by separate code block in
innobase_get_mysql_key_number_for_index() which was wrong on using
secondary index for dict_index_is_auto_gen_clust() and it was not
covered by the tests.
page_is_corrupted(): Do not allocate the buffers from stack,
but from the heap, in xb_fil_cur_open().
row_quiesce_write_cfg(): Issue one type of message when we
fail to create the .cfg file.
update_statistics_for_table(), read_statistics_for_table(),
delete_statistics_for_table(), rename_table_in_stat_tables():
Use a common stack buffer for Index_stat, Column_stat, Table_stat.
ha_connect::FileExists(): Invoke push_warning_printf() so that
we can avoid allocating a buffer for snprintf().
translog_init_with_table(): Do not duplicate TRANSLOG_PAGE_SIZE_BUFF.
Let us also globally enable the GCC 4.4 and clang 3.0 option
-Wframe-larger-than=16384 to reduce the possibility of introducing
such stack overflow in the future. For RocksDB and Mroonga we relax
these limits.
Reviewed by: Vladislav Lesin
In commit b6923420f3 (MDEV-29445)
some hash tables were accidentally created with the minimum size
(101 entries) instead of correctly deriving the size from the
initial innodb_buffer_pool_size. This led to very long hash bucket
chains, which are very slow to traverse.
ut_find_prime(): Assert that the size is nonzero in order to catch
this type of regression in the future.
innodb_init_params(): Do not bother reading buf_pool.curr_size()
when it is known to be 0,
srv_start(): Correctly initialize srv_lock_table_size to 5 times
buf_pool.curr_size(), that is, the buffer pool size in pages,
between invoking buf_pool.create() and lock_sys.create().
btr_search_enable(), dict_sys_t::create(), dict_sys_t::resize():
Correctly refer to buf_pool.curr_pool_size(), that is,
innodb_buffer_pool_size in bytes, when calculating the hash table size.
In MDEV-29445 the expressions buf_pool_get_curr_size() were
accidentally replaced with buf_pool.curr_size().
buf_buddy_shrink(): Properly cover the case when KEY_BLOCK_SIZE
corresponds to the innodb_page_size, that is, the ROW_FORMAT=COMPRESSED
page frame is directly allocated from the buffer pool, not via the
binary buddy allocator.
buf_LRU_check_size_of_non_data_objects(): Avoid a crash when the
buffer pool is being shrunk.
buf_pool_t::shrink(): Abort if over 95% of the shrunk buffer pool
would be occupied by the adaptive hash index or record locks.
In commit b6923420f3 (MDEV-29445)
we started to specify the MAP_POPULATE flag for allocating the
InnoDB buffer pool. This would cause a lot of time to be spent
on __mm_populate() inside the Linux kernel, such as 16 seconds
to pre-fault or commit innodb_buffer_pool_size=64G.
Let us revert to the previous way of allocating the buffer pool
at startup. Note: An attempt to increase the buffer pool size by
SET GLOBAL innodb_buffer_pool_size (up to innodb_buffer_pool_size_max)
will invoke my_virtual_mem_commit(), which will use MAP_POPULATE
to zero-fill and prefault the requested additional memory area, blocking
buf_pool.mutex.
Before MDEV-29445 we allocated the InnoDB buffer pool by invoking
mmap(2) once (via my_large_malloc()). After the change, we would
invoke mmap(2) twice, first via my_virtual_mem_reserve() and then
via my_virtual_mem_commit(). Outside Microsoft Windows, we are
reverting back to my_large_malloc() like allocation.
my_virtual_mem_reserve(): Define only for Microsoft Windows.
Other platforms should invoke my_large_virtual_alloc() and
update_malloc_size() instead of my_virtual_mem_reserve() and
my_virtual_mem_commit().
my_large_virtual_alloc(): Define only outside Microsoft Windows.
Do not specify MAP_NORESERVE nor MAP_POPULATE, to preserve compatibility
with my_large_malloc(). Were MAP_POPULATE specified, the mmap()
system call would be significantly slower, for example 18 seconds
to reserve 64 GiB upfront.
log_t::append_prepare_wait(): Do not attempt to read log_sys.write_lsn
because it is not protected by log_sys.latch but by write_lock, which
we cannot hold here. The assertion could fail if log_t::write_buf()
is executing concurrently, and it has not yet executed log_write_buf()
or updated log_sys.write_lsn.
Fixes up commit acd071f599 (MDEV-21923)
log_t::append_prepare_wait(): Do not attempt to read log_sys.write_lsn
because it is not protected by log_sys.latch but by write_lock, which
we cannot hold here. The assertion could fail if log_t::write_buf()
is executing concurrently, and it has not yet executed log_write_buf()
or updated log_sys.write_lsn.
Fixes up commit acd071f599 (MDEV-21923)
Valgrind is single threaded and only changes threads as part of
system calls or waits.
Some busy loops were identified and fixed where the server assumes
that some other thread will change the state, which will not happen
with valgrind.
Based on patch by Monty. Original patch introduced VALGRIND_YIELD,
which emits pthread_yield() only in valgrind builds. However it was
agreed that it is a good idea to emit yield() unconditionally, such
that other affected schedulers (like SCHED_FIFO) benefit from this
change. Also avoid pthread_yield() in favour of standard
std::this_thread::yield().
The minimum statistics level now is rocksdb::StatsLevel::kDisableAll.
The default remains rocksdb::StatsLevel::kExceptHistogramOrTimers
which is now 1 (it used to be 0).
With view protocol collation_connection is reset in mysql_make_view in
the "SELECT * FROM mysqltest_tmp_v" query. In the case of
spider/bugfix.mdev_33434, it is reset to latin1_swedish_ci, with the
latin1 charset.
This results in no conversion needed since it is the same as
character_set_client and the corresponding argument in the udf remains
unchanged, with "dummy" srv value. Thus the reported error is
1477: 'The foreign server name you are trying to reference does not exist. Data source error: dummy'
Without view protocol, the character_set_connection ucs2 setting in
the test survives, and the conversion results in empty connection
parameters, and the reported error is 1429
ER_CONNECT_TO_FOREIGN_DATA_SOURCE
This failure is irrelevant to the test, or to spider at all. Therefore
we disable view protocol for the statement.
In spider/bugfix.mdev_29352, with flush tables with read lock,
statements blocked in THD::has_read_only_protection() by checking
THD::global_read_lock could result in view protocol to "hang" waiting
for acquiring mdl in another THD.
In spider/bugfix.mdev_34555, within an XA transaction, statements
blocked by trans_check() by checking thd->transaction->xid_state could
result in view protocol to "hang" for the same reason.
Therefore we disable view protocol for relevant statements in these
tests.
Spider tables do not support SELECT SQL_CALC_FOUND_ROWS and the
correct test output is a coincidence. Debugging shows that the
limit_found_rows field was last updated in an unrelated statement:
SELECT STRAIGHT_JOIN a.a, a.b, date_format(b.c, '%Y-%m-%d %H:%i:%s')\nFROM ta_l a, tb_l b WHERE a.a = b.a ORDER BY a.a
As a byproduct, this fixes the "wrong found_rows() results" when
running these tests with view protocol.
The failure is caused by exec $stmt where $stmt has two queries.
mtr with view protocol transforms the first query into a view, leaving
the second query executed in the usual way. mtr being oblivious about
the second query then does not handle its results, resulting in
CR_COMMANDS_OUT_OF_SYNC. We disable view protocol for such edge cases.
After fixing these "Failed to drop view: 0: " further failures emerge
from two of the tests, which are the same problem as MDEV-36454, so we
fix them to by disabling view protocol for the relevant SELECTs.
Reason:
=======
- MDEV-16239 does apply the DML logs after bulk insert for
ALTER TABLE..ALGORITHM=COPY, but InnoDB fails to reset the bulk_insert
in ha_innobase::extra(HA_EXTRA_END_ALTER_COPY). This leads to crash
while applying DML logs.
Solution:
=======
ha_innobase::extra(HA_EXTRA_END_ALTER_COPY): Reset TRX_DDL_BULK at the
end of bulk insert operation
A statement SET GLOBAL innodb_buffer_pool_size=...
could fail for no good reason when the buffer pool contains many
pages that can actually be evicted.
buf_flush_LRU_list_batch(): Keep evicting as long as the buffer pool
is being shrunk, for at most innodb_lru_scan_depth extra blocks.
Disregard the flush limit for pages that are marked as freed in files.
buf_flush_LRU_to_withdraw(): Update the to_withdraw target during
buf_flush_LRU_list_batch().
buf_pool_t::will_be_withdrawn(): Allow also ptr=nullptr (the condition
will not hold for it).
This fixes a regression that was introduced in
commit b6923420f3 (MDEV-29445)
and caught by the test innodb.temp_truncate_freed in MariaDB Server 11.4.
Tested by: Thirunarayanan Balathandayuthapani
Reviewed by: Thirunarayanan Balathandayuthapani
Set solution is to check if transaction, which modified a record, is
still active in lock_clust_rec_read_check_and_lock(). if yes, then just
request a lock. If no, then, depending on if the current transaction read
view can see the changes, return eighter DB_RECORD_CHANGED or request a
lock.
We can do the check in lock_clust_rec_read_check_and_lock() because
transaction tries to set a lock on the record which cursor points to after
transaction resuming and cursor position restoring. If the lock already
exists, then we don't request the lock again. But for the current commit
it's important that lock_clust_rec_read_check_and_lock() will be invoked
again for the same record, so we can do the check again after
transaction, which modified a record, was committed or rolled back.
MDEV-33802(4aa9291) is partially reverted. If some transaction holds
implicit lock on some record and transaction with snapshot isolation level
requests conflicting lock on the same record, it should be blocked instead
of returning DB_RECORD_CHANGED to have ability to continue execution when
implicit lock owner is rolled back.
The construction
--------------------------------------------------------------------------
let $wait_condition=
select count(*) = 1 from information_schema.processlist
where state = 'Updating' and info = 'UPDATE t SET b = 2 WHERE a';
--source include/wait_condition.inc
--------------------------------------------------------------------------
is not reliable enought to make sure transaction is blocked in test
case, the test failed sporadically with
--------------------------------------------------------------------------
./mtr --max-test-fail=1 --parallel=96 lock_isolation{,,,,,,,}{,,,}{,,} \
--repeat=500
--------------------------------------------------------------------------
command. That's why it was replaced with debug sync-points.
Reviewed by: Marko Mäkelä
during mariabackup --prepare
Reason:
======
During --prepare of partial backup, if InnoDB encounters the redo log
for the excluded tablespace then InnoDB stores the space id in dirty
tablespace list during recovery, anticipates that it may encounter
FILE_* redo log records in the future. Even though we encounter FILE_*
record for the partial excluded tablespace then we fail to replace the
name in dirty tablespace list. This lead to missing of
FILE_* redo log records error.
Solution:
========
fil_name_process(): Rename the file name from "" to name encountered
during FILE_* record
recv_init_missing_space(): Correct the condition to print the warning
message of missing tablespace during mariabackup restore process.
- InnoDB fails to check the table is being dropped or evicted
while acquiring the MDL for the table when table open operation
mode is DICT_TABLE_OP_OPEN_ONLY_IF_CACHED. This is caused by
the commit 337bf8ac4b (MDEV-36122)
Fix:
===
dict_acquire_mdl_shared(): If the table is evicted or dropped when
table operation mode is DICT_TABLE_OP_OPEN_IF_CACHED then return
nullptr
ha_innobase::info_low(): Assert that dict_table_t::stat_initialized()
only within the critical section. Changes of this field should be
protected by dict_table_t::lock_latch.
Problem:
========
- After commit cc8eefb0dc (MDEV-33087),
InnoDB does use bulk insert operation for ALTER TABLE.. ALGORITHM=COPY
and CREATE TABLE..SELECT as well. InnoDB fails to clear the bulk
buffer when it encounters error during CREATE..SELECT. Problem
is that while transaction cleanup, InnoDB fails to identify
the bulk insert for DDL operation.
Fix:
====
- Represent bulk_insert in trx by 2 bits. By doing that, InnoDB
can distinguish between TRX_DML_BULK, TRX_DDL_BULK. During DDL,
set bulk insert value for transaction to TRX_DDL_BULK.
- Introduce a parameter HA_EXTRA_ABORT_ALTER_COPY which rollbacks
only TRX_DDL_BULK transaction.
- bulk_insert_apply() happens for TRX_DDL_BULK transaction happens
only during HA_EXTRA_END_ALTER_COPY extra() call.
...when using io_uring on a potentially affected kernel"
Remove version check on the kernel as it now corresponds to
a working RHEL9 kernel and the problem was only there in
pre-release kernels that shouldn't have been used in production.
This reverts commit 1193a793c4.
Remove version check on the kernel as it now corresponds to
a working RHEL9 kernel and the problem was only there in
pre-release kernels that shouldn't have been used in production.
This reverts commit 3dc0d884ec.
Starting with mysql/mysql-server@02f8eaa998
and commit 2e814d4702 the index ID of
indexes on virtual columns was being encoded insufficiently in
InnoDB undo log records. Only the least significant 32 bits were
being written. This could lead to some corruption of the affected
indexes on ROLLBACK, as well as to missed chances to remove some
history from such indexes when purging the history of committed
transactions that included DELETE or an UPDATE in the indexes.
dict_hdr_create(): In debug instrumented builds, initialize the
DICT_HDR_INDEX_ID close to the 32-bit barrier, instead of initializing
it to DICT_HDR_FIRST_ID (10). This will allow the changed code to
be exercised while running ./mtr --suite=gcol,vcol.
trx_undo_log_v_idx(): Encode large index->id in a similar way as
mysql/mysql-server@e00328b4d0
but using a different implementation.
trx_undo_read_v_idx_low(): Decode large index->id in a similar way
as mach_u64_read_much_compressed().
Reviewed by: Debarun Banerjee
Added retry logic to certain file operations during installation as a
workaround for issues caused by buggy antivirus software on Windows.
Retry logic added for WritePrivateProfileString (mysql_install_db.cc)
and renaming file in Innodb.
Problem was that thread was holding lock_sys.wait_mutex when
streaming replication transaction rollback was handled and
in wsrep-lib requests THD::LOCK_thd_kill mutex causing
wrong mutex usage (thd->reset_globals()).
Fix is to remove streaming replication rollback handling
from Deadlock::report() i.e. wsrep_handle_SR_rollback call.
Purpose of Deadloc::report() is to find a cycle in the
waits-for graph if exists, report it, mark victim transaction
as deadlock victim and release locks it is waiting for.
Actual streaming replication rollback that can take longer
time can be handled later at trx_t::rollback where
lock_sys.wait_mutex is not held.
Signed-off-by: Julius Goryavsky <julius.goryavsky@mariadb.com>
log_t::clear_mmap(): Do not modify buf_size; we may have
file_size==0 here during bootstrap.
log_t::set_recovered(): If we are writing to a memory-mapped log,
update log_sys.buf_size to the record payload area of log_sys.buf.
This fixes up commit acd071f599
(MDEV-21923).