Assertions failed due to incorrect handling of the --tc-heuristic-recover
option when InnoDB is in read-only mode either due to innodb_read_only=1
or innodb_force_recovery>3. InnoDB failed to refuse a XA COMMIT or
XA ROLLBACK operation, and there were errors in the error handling in
the upper layer.
This was fixed by making InnoDB XA operations respect the
high_level_read_only flag. The InnoDB part of the fix and
parts of the test main.tc_heuristic_recover were provided
by Marko Mäkelä.
LOCK_log mutex lock/unlock had to be added to fix MDEV-13438.
The measure is confirmed by mysql sources as well.
For testing of the conflicting option combination, mysql-test-run is
made to export a new $MYSQLD_LAST_CMD. It holds the very last value
generated by mtr.mysqld_start(). Even though the options have been
also always stored in $mysqld->{'started_opts'} there were no access
to them beyond the automatic server restart by mtr through the expect
file interface.
Effectively therefore $MYSQLD_LAST_CMD represents a more general
interface to $mysqld->{'started_opts'} which can be used in wider
scopes including server launch with incompatible options.
Notice another existing method to restart the server with incompatible
options relying on $MYSQLD_CMD is is aware of $mysqld->{'started_opts'}
(the actual options that the server is launched by mtr). In order to use
this method they would have to be provided manually.
NOTE: When merging to 10.2, the file search_pattern_in_file++.inc
should be replaced with the pre-existing search_pattern_in_file.inc.
The function ibuf_remove_free_page() may be called while the caller
is holding several mutexes or rw-locks. Because of this, this
housekeeping loop may cause performance glitches for operations that
involve tables that are stored in the InnoDB system tablespace.
Also deadlocks might be possible.
The worst impact of all is that due to the mutexes being held, calls to
log_free_check() had to be skipped during this housekeeping.
This means that the cyclic InnoDB redo log may be overwritten.
If the system crashes during this, it would be unable to recover.
The entry point to the problematic code is ibuf_free_excess_pages().
It would make sense to call it before acquiring any mutexes or rw-locks,
in any 'pessimistic' operation that involves the system tablespace.
fseg_create_general(), fseg_alloc_free_page_general(): Do not call
ibuf_free_excess_pages() while potentially holding some latches.
ibuf_remove_free_page(): Do call log_free_check(), like every operation
that is about to generate redo log should do.
ibuf_free_excess_pages(): Remove some assertions that are replaced
by stricter assertions in the log_free_check() that is now called by
ibuf_remove_free_page().
row_ins_sec_index_entry(), row_undo_ins_remove_sec_low(),
row_undo_mod_del_mark_or_remove_sec_low(),
row_undo_mod_del_unmark_sec_and_undo_update(): Call
ibuf_free_excess_pages() if the operation may involve allocating pages
and change buffering in the system tablespace.
When MySQL 5.0.3 introduced InnoDB support for two-phase commit,
it also introduced the questionable logic to roll back XA PREPARE
transactions on startup when innodb_force_recovery is 1 or 2.
Remove this logic in order to avoid unwanted side effects when
innodb_force_recovery is being set for other reasons. That is,
XA PREPARE transactions will always remain in that state until
InnoDB receives an explicit XA ROLLBACK or XA COMMIT request
from the upper layer.
At the time the logic was introduced in MySQL 5.0.3, there already
was a startup parameter that is the preferred way of achieving
the behaviour: --tc-heuristic-recover=ROLLBACK.
Revert the following change, because Memcached is not present
in MariaDB Server. We had better avoid adding dead code.
commit d9bc5e03d788b958ce8c76e157239953db60adb2
Author: Aakanksha Verma <aakanksha.verma@oracle.com>
Date: Thu May 18 14:31:01 2017 +0530
Bug #24605783 MYSQL GOT SIGNAL 6 ASSERTION FAILURE
Following merge from 5.6.36, this merge also rejects changes that
collided with the rejection of 6ca4f693c1ce472e2b1bf7392607c2d1124b4293.
We initially rejected 6ca4f693c1ce472e2b1bf7392607c2d1124b4293 because
it was introducing a new storage engine API method.
Problem was that dict_sys->size tries to maintain used memory
occupied by the data dictionary table and index objects.
However at least on table objects table->heap size can increase
between when table object is inserted to dict_sys and when
it is removed from dict_sys causing inconsistency on amount
of memory added to and removed from dict_sys->size variable.
Removed unnecessary dict_sys:size variable as it is really
used only for status output.
Introduced dict_sys_get_size function to calculate memory
occupied by the data dictionary table and index objects
that is then used on show engine innodb output.
dict_table_add_to_cache(),
dict_table_rename_in_cache(),
dict_table_remove_from_cache_low(),
dict_index_remove_from_cache_low(),
Remove size calculation.
srv_printf_innodb_monitor(): Use dict_sys_get_size function to
get dictionary memory allocated.
xtradb_internal_hash_tables_fill_table(): Use dict_sys_get_size
function to get dictionary memory allocated.
log_calc_max_ages(): Use the requested size in the check, instead of
the detected redo log size. The redo log will be resized at startup
if it differs from what has been requested.
When a slow shutdown is performed soon after spawning some work for
background threads that can create or commit transactions, it is possible
that new transactions are started or committed after the purge has finished.
This is violating the specification of innodb_fast_shutdown=0, namely that
the purge must be completed. (None of the history of the recent transactions
would be purged.)
Also, it is possible that the purge threads would exit in slow shutdown
while there exist active transactions, such as recovered incomplete
transactions that are being rolled back. Thus, the slow shutdown could
fail to purge some undo log that becomes purgeable after the transaction
commit or rollback.
srv_undo_sources: A flag that indicates if undo log can be generated
or the persistent, whether by background threads or by user SQL.
Even when this flag is clear, active transactions that already exist
in the system may be committed or rolled back.
innodb_shutdown(): Renamed from innobase_shutdown_for_mysql().
Do not return an error code; the operation never fails.
Clear the srv_undo_sources flag, and also ensure that the background
DROP TABLE queue is empty.
srv_purge_should_exit(): Do not allow the purge to exit if
srv_undo_sources are active or the background DROP TABLE queue is not
empty, or in slow shutdown, if any active transactions exist
(and are being rolled back).
srv_purge_coordinator_thread(): Remove some previous workarounds
for this bug.
innobase_start_or_create_for_mysql(): Set buf_page_cleaner_is_active
and srv_dict_stats_thread_active directly. Set srv_undo_sources before
starting the purge subsystem, to prevent immediate shutdown of the purge.
Create dict_stats_thread and fts_optimize_thread immediately
after setting srv_undo_sources, so that shutdown can use this flag to
determine if these subsystems were started.
dict_stats_shutdown(): Shut down dict_stats_thread. Backported from 10.2.
srv_shutdown_table_bg_threads(): Remove (unused).
InnoDB shutdown assumes that once the server has entered
SRV_SHUTDOWN_FLUSH_PHASE, no change to persistent data is allowed.
It was possible for the master thread to wake up while shutdown
is executing in SRV_SHUTDOWN_FLUSH_PHASE or
even in SRV_SHUTDOWN_LAST_PHASE.
We do not yet know if further crashes at shutdown are possible.
Also, we do not know if all the observed crashes could be explained
by the race conditions that we are now fixing.
srv_shutdown_print_master_pending(): Remove a redundant ut_time() call.
srv_shutdown(): Renamed from srv_master_do_shutdown_tasks().
srv_master_thread(): Do not resume after shutdown has been initiated.
This fixes warnings that were emitted when running InnoDB test
suites on a debug server that was compiled with GCC 7.1.0 using
the flags -O3 -fsanitize=undefined.
thd_requested_durability(): XtraDB can call this with trx->mysql_thd=NULL.
Remove the function in InnoDB, because it is not used there.
calc_row_difference(): Do not call memcmp(o_ptr, NULL, 0).
innobase_index_name_is_reserved(): This can be called with
key_info=NULL, num_of_keys=0.
innobase_dropping_foreign(), innobase_check_foreigns_low(),
innobase_check_foreigns(): This can be called with
drop_fk=NULL, n_drop_fk=0.
rec_convert_dtuple_to_rec_comp(): Do not invoke memcpy(end, NULL, 0).
On 64-bit systems, the constant 1 would be 32-bit (int or unsigned)
by default. Cast the constant to ulint before shifting to avoid a
-fsanitize=undefined warning or any potential overflow.
Fix a -fsanitizer=undefined warning that trx_undo_report_row_operation()
was being passed thr=NULL when the BTR_NO_UNDO_LOG_FLAG flag was set.
trx_undo_report_row_operation(): Remove the first two parameters.
The parameter clust_entry!=NULL distinguishes inserts from updates.
This should be a non-functional change (no observable change in
behaviour; slightly smaller code).
Allocate srv_sys statically so that the desired alignment can be
guaranteed. This silences -fsanitize=undefined warnings.
There probably is no performance impact of this, because the
reason for the alignment to ensure the absence of false sharing
between counters. Even with the misalignment, each counter would
have been been aligned at 64 bits, and the counters would reside
in separate cache lines.
The parameter thr of the function btr_cur_optimistic_insert()
is not declared as nonnull, but GCC 7.1.0 with -O3 is wrongly
optimizing away the first part of the condition
UNIV_UNLIKELY(thr && thr_get_trx(thr)->fake_changes)
when the function is being called by row_merge_insert_index_tuples()
with thr==NULL.
The fake_changes is an XtraDB addition. This GCC bug only appears
to have an impact on XtraDB, not InnoDB.
We work around the problem by not attempting to dereference thr
when both BTR_NO_LOCKING_FLAG and BTR_NO_UNDO_LOG_FLAG are set
in the flags. Probably BTR_NO_LOCKING_FLAG alone should suffice.
btr_cur_optimistic_insert(), btr_cur_pessimistic_insert(),
btr_cur_pessimistic_update(): Correct comments that disagree with
usage and with nonnull attributes. No other parameter than thr can
actually be NULL.
row_ins_duplicate_error_in_clust(): Remove an unused parameter.
innobase_is_fake_change(): Unused function; remove.
ibuf_insert_low(), row_log_table_apply(), row_log_apply(),
row_undo_mod_clust_low():
Because we will be passing BTR_NO_LOCKING_FLAG | BTR_NO_UNDO_LOG_FLAG
in the flags, the trx->fake_changes flag will be treated as false,
which is the right thing to do at these low-level operations
(change buffer merge, ALTER TABLE…LOCK=NONE, or ROLLBACK).
This might be fixing actual XtraDB bugs.
Other callers that pass these two flags are also passing thr=NULL,
implying fake_changes=false. (Some callers in ROLLBACK are passing
BTR_NO_LOCKING_FLAG and a nonnull thr. In these callers, fake_changes
better be false, to avoid corruption.)
This merge reverts commit 6ca4f693c1ce472e2b1bf7392607c2d1124b4293
from current 5.6.36 innodb.
Bug #23481444 OPTIMISER CALL ROW_SEARCH_MVCC() AND READ THE
INDEX APPLIED BY UNCOMMITTED ROW
Problem:
========
row_search_for_mysql() does whole table traversal for range query
even though the end range is passed. Whole table traversal happens
when the record is not with in transaction read view.
Solution:
=========
Convert the innodb last record of page to mysql format and compare
with end range if the traversal of row_search_mvcc() exceeds 100,
no ICP involved. If it is out of range then InnoDB can avoid the
whole table traversal. Need to refactor the code little bit to
make it compile.
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
Reviewed-by: Knut Hatlen <knut.hatlen@oracle.com>
Reviewed-by: Dmitry Shulga <dmitry.shulga@oracle.com>
RB: 14660
The macro UT_LIST_INIT() zero-initializes the UT_LIST_NODE.
There is no need to call this macro on a buffer that has
already been zero-initialized by mem_zalloc() or mem_heap_zalloc()
or similar.
For some reason, the statement UT_LIST_INIT(srv_sys->tasks) in
srv_init() caused a SIGSEGV on server startup when compiling with
GCC 7.1.0 for AMD64 using -O3. The zero-initialization was attempted
by the instruction movaps %xmm0,0x50(%rax), while the proper offset
of srv_sys->tasks would seem to have been 0x48.
Do not silence uncertain cases, or fix any bugs.
The only functional change should be that ha_federated::extra()
is not calling DBUG_PRINT to report an unhandled case for
HA_EXTRA_PREPARE_FOR_DROP.
Given the OK macro used in innodb does a DBUG_RETURN(1) on expression failure
the innodb implementation has a number of errors in i_s.cc.
We introduce a new macro BREAK_IF that replaces some use of the OK macro.
Also, do some other cleanup detailed below.
When invoking Field::store() on integers, always pass the parameter
is_unsigned=true to avoid an unnecessary conversion to double.
i_s_fts_deleted_generic_fill(), i_s_fts_config_fill():
Use the BREAK_IF macro instead of OK.
i_s_fts_index_cache_fill_one_index(), i_s_fts_index_table_fill_one_index():
Add a parameter for conv_string, and let the caller allocate that buffer.
i_s_fts_index_cache_fill(): Check the return status of
i_s_fts_index_cache_fill_one_index().
i_s_fts_index_table_fill(): Check the return status of
i_s_fts_index_table_fill_one_index().
i_s_fts_index_table_fill_one_fetch(): Always let the caller invoke
i_s_fts_index_table_free_one_fetch().
i_s_innodb_buffer_page_fill(), i_s_innodb_buf_page_lru_fill():
Do release dict_sys->mutex if filling the buffers fails.
i_s_innodb_buf_page_lru_fill(): Also display the value
INFORMATION_SCHEMA.INNODB_BUFFER_PAGE.PAGE_IO_FIX='IO_PIN'
when a block is in that state. Remove the unnecessary variable 'heap'.
simple_counter::add(): Add a type cast to the os_atomic_increment_ulint()
call, because GCC would check the type compatibility even when the code
branch is not being instantiated (atomic=false). On Solaris,
os_atomic_increment_ulint() actually needs a compatible parameter type,
and an error would be emitted due to an incompatible 64-bit type,
for srv_stats.n_lock_wait_time.add(diff_time).
There is a race condition related to the variable
srv_stats.n_lock_wait_current_count, which is only
incremented and decremented by the function lock_wait_suspend_thread(),
The incrementing is protected by lock_sys->wait_mutex, but the
decrementing does not appear to be protected by anything.
This mismatch could allow the counter to be corrupted when a
transactional InnoDB table or record lock wait is terminating
roughly at the same time with the start of a wait on a
(possibly different) lock.
ib_counter_t: Remove some unused methods. Prevent instantiation for N=1.
Add an inc() method that takes a slot index as a parameter.
single_indexer_t: Remove.
simple_counter<typename Type, bool atomic=false>: A new counter wrapper.
Optionally use atomic memory operations for modifying the counter.
Aligned to the cache line size.
lsn_ctr_1_t, ulint_ctr_1_t, int64_ctr_1_t: Define as simple_counter<Type>.
These counters are either only incremented (and we do not care about
losing some increment operations), or the increment/decrement operations
are protected by some mutex.
srv_stats_t::os_log_pending_writes: Document that the number is protected
by log_sys->mutex.
srv_stats_t::n_lock_wait_current_count: Use simple_counter<ulint, true>,
that is, atomic inc() and dec() operations.
lock_wait_suspend_thread(): Release the mutexes before incrementing
the counters. Avoid acquiring the lock mutex if the lock wait has
already been resolved. Atomically increment and decrement
srv_stats.n_lock_wait_current_count.
row_insert_for_mysql(), row_update_for_mysql(),
row_update_cascade_for_mysql(): Use the inc() method with the trx->id
as the slot index. This is a non-functional change, just using
inc() instead of add(1).
buf_LRU_get_free_block(): Replace the method add(index, n) with inc().
There is no slot index in the simple_counter.
This is a reduced version of an originally much larger patch.
We will keep the definition of the ulint, lint data types unchanged,
and we will not be replacing fprintf() calls with ib_logf().
On Windows, use the standard format strings instead of nonstandard
extensions.
This patch fixes some errors in format strings.
Most notably, an IMPORT TABLESPACE error message in InnoDB was
displaying the number of columns instead of the mismatching flags.
Allow 64-bit atomic operations on 32-bit systems,
only relying on HAVE_ATOMIC_BUILTINS_64, disregarding
the width of the register file.
Define UNIV_WORD_SIZE correctly on all systems, including Windows.
In MariaDB 10.0 and 10.1, it was incorrectly defined as 4 on
64-bit Windows.
Define HAVE_ATOMIC_BUILTINS_64 on Windows
(64-bit atomics are available on both 32-bit and 64-bit Windows
platforms; the operations were unnecessarily disabled even on
64-bit Windows).
MONITOR_OS_PENDING_READS, MONITOR_OS_PENDING_WRITES: Enable by default.
os_file_n_pending_preads, os_file_n_pending_pwrites,
os_n_pending_reads, os_n_pending_writes: Remove.
Use the monitor counters instead.
os_file_count_mutex: Remove. On a system that does not support
64-bit atomics, monitor_mutex will be used instead.
In the 10.1 InnoDB Plugin, a call os_event_free(buf_flush_event) was
misplaced. The event could be signalled by rollback of resurrected
transactions while shutdown was in progress. This bug was caught
by cmake -DWITH_ASAN testing. This call was only present in the
10.1 InnoDB Plugin, not in other versions, or in XtraDB.
That said, the bug affects all InnoDB versions. Shutdown assumes the
cessation of any page-dirtying activity, including the activity of
the background rollback thread. InnoDB only waited for the background
rollback to finish as part of a slow shutdown (innodb_fast_shutdown=0).
The default is a clean shutdown (innodb_fast_shutdown=1). In a scenario
where InnoDB is killed, restarted, and shut down soon enough, the data
files could become corrupted.
logs_empty_and_mark_files_at_shutdown(): Wait for the
rollback to finish, except if innodb_fast_shutdown=2
(crash-like shutdown) was requested.
trx_rollback_or_clean_recovered(): Before choosing the next
recovered transaction to roll back, terminate early if non-slow
shutdown was initiated. Roll back everything on slow shutdown
(innodb_fast_shutdown=0).
srv_innodb_monitor_mutex: Declare as static, because the mutex
is only used within one module.
After each call to os_event_free(), ensure that the freed event
is not reachable via global variables, by setting the relevant
variables to NULL.
Also, implement MDEV-11027 a little differently from 5.5:
recv_sys_t::report(ib_time_t): Determine whether progress should
be reported.
recv_apply_hashed_log_recs(): Rename the parameter to last_batch.
Provide more useful progress reporting of crash recovery.
recv_sys_t::progress_time: The time of the last report.
recv_scan_print_counter: Remove.
log_group_read_log_seg(): After after each I/O request,
report progress if needed.
recv_apply_hashed_log_recs(): At the start of each batch,
if there are pages to be recovered, issue a message.