Problem was that 4k page size is not really supported in
Galera. For reference see:
codership/galera#398
Page size 4k is problematic because WSREP XID info location
that was set to constant UNIV_PAGE_SIZE - 3500 and that is conflicting
with rseg undo slots location if there is lot of undo tablespaces.
Undo tablespace identifiers and page numbers require
at least 126*8=1024 bytes starting from offset 56. Therefore,
WSREP XID startig from offset 596 would overwrite several
space_id,page_no pairs starting from 72th undo log tablespace
space_id,page_no pair at offset 594.
This will cause InnoDB startup failure seen as
[ERROR] InnoDB: Unable to open undo tablespace './undo30579'.
Originally, the undo tablespace ID would always be between
0 and 127. Starting with MySQL 5.6.36 which introduced
Bug #25551311 BACKPORT BUG #23517560 REMOVE SPACE_ID RESTRICTION
FOR UNDO TABLESPACES (merged to MariaDB 10.0.31)
it is possible for an undo tablespace ID to be 0x7773. But in
this case, the page number should be 3, not 0x72650003.
This is just the first collision. The WSREP XID data would
overwrite subsequent slots.
trx0sys.h
trx0sys.cc
Code formatting and add comments.
Following merge from 5.6.36, this merge also rejects changes that
collided with the rejection of 6ca4f693c1ce472e2b1bf7392607c2d1124b4293.
We initially rejected 6ca4f693c1ce472e2b1bf7392607c2d1124b4293 because
it was introducing a new storage engine API method.
Problem was that dict_sys->size tries to maintain used memory
occupied by the data dictionary table and index objects.
However at least on table objects table->heap size can increase
between when table object is inserted to dict_sys and when
it is removed from dict_sys causing inconsistency on amount
of memory added to and removed from dict_sys->size variable.
Removed unnecessary dict_sys:size variable as it is really
used only for status output.
Introduced dict_sys_get_size function to calculate memory
occupied by the data dictionary table and index objects
that is then used on show engine innodb output.
dict_table_add_to_cache(),
dict_table_rename_in_cache(),
dict_table_remove_from_cache_low(),
dict_index_remove_from_cache_low(),
Remove size calculation.
srv_printf_innodb_monitor(): Use dict_sys_get_size function to
get dictionary memory allocated.
xtradb_internal_hash_tables_fill_table(): Use dict_sys_get_size
function to get dictionary memory allocated.
end_io_call uses uninitialized values from the new_data_cache
As such we the buffer 0 and check this before calling end_io_cache on it.
Thanks Sergey Vojtovich for the review and for this solution.
Found by Coverity (ref 972481).
Coverity report this as:
CID 971840 (#1 of 1): Operands don't affect result (CONSTANT_EXPRESSION_RESULT)
result_independent_of_operands: 4 | (flags & 1) is always true regardless of the values of its operands. This occurs as the logical first operand of "?:".
The C order of precidence has | of higher precidence than ?:. The
intenting implies an | of the 3 terms.
Adjust to intented meaning.
log_calc_max_ages(): Use the requested size in the check, instead of
the detected redo log size. The redo log will be resized at startup
if it differs from what has been requested.
in innodb_read_only mode.
The reason for the hang is that there was no notification received about
completed read io. File handles are bound to completion_port, and there
were no background "write" threads that would be waiting on completion_port,
only 2 "read" threads waiting on read_completion_port were active.
The fix is to use a single IO completion port for all IOs, if
innodb_read_only is set.
When the server is started in innodb_read_only mode, there cannot be
any writes to persistent InnoDB/XtraDB files. Just like the creation
of buf_flush_page_cleaner_thread is skipped in this case, also
the creation of the XtraDB-specific buf_flush_lru_manager_thread
should be skipped.
When a slow shutdown is performed soon after spawning some work for
background threads that can create or commit transactions, it is possible
that new transactions are started or committed after the purge has finished.
This is violating the specification of innodb_fast_shutdown=0, namely that
the purge must be completed. (None of the history of the recent transactions
would be purged.)
Also, it is possible that the purge threads would exit in slow shutdown
while there exist active transactions, such as recovered incomplete
transactions that are being rolled back. Thus, the slow shutdown could
fail to purge some undo log that becomes purgeable after the transaction
commit or rollback.
srv_undo_sources: A flag that indicates if undo log can be generated
or the persistent, whether by background threads or by user SQL.
Even when this flag is clear, active transactions that already exist
in the system may be committed or rolled back.
innodb_shutdown(): Renamed from innobase_shutdown_for_mysql().
Do not return an error code; the operation never fails.
Clear the srv_undo_sources flag, and also ensure that the background
DROP TABLE queue is empty.
srv_purge_should_exit(): Do not allow the purge to exit if
srv_undo_sources are active or the background DROP TABLE queue is not
empty, or in slow shutdown, if any active transactions exist
(and are being rolled back).
srv_purge_coordinator_thread(): Remove some previous workarounds
for this bug.
innobase_start_or_create_for_mysql(): Set buf_page_cleaner_is_active
and srv_dict_stats_thread_active directly. Set srv_undo_sources before
starting the purge subsystem, to prevent immediate shutdown of the purge.
Create dict_stats_thread and fts_optimize_thread immediately
after setting srv_undo_sources, so that shutdown can use this flag to
determine if these subsystems were started.
dict_stats_shutdown(): Shut down dict_stats_thread. Backported from 10.2.
srv_shutdown_table_bg_threads(): Remove (unused).
InnoDB shutdown assumes that once the server has entered
SRV_SHUTDOWN_FLUSH_PHASE, no change to persistent data is allowed.
It was possible for the master thread to wake up while shutdown
is executing in SRV_SHUTDOWN_FLUSH_PHASE or
even in SRV_SHUTDOWN_LAST_PHASE.
We do not yet know if further crashes at shutdown are possible.
Also, we do not know if all the observed crashes could be explained
by the race conditions that we are now fixing.
srv_shutdown_print_master_pending(): Remove a redundant ut_time() call.
srv_shutdown(): Renamed from srv_master_do_shutdown_tasks().
srv_master_thread(): Do not resume after shutdown has been initiated.