Problem:
========
- Drop table assert if innodb_force_recovery is set to 5 or 6.
For innodb_force_recovery 5 and 6, InnoDB doesn't scan the undo log
and it makes the redo rollback segment as NULL. There is no way for
transaction to write any undo log.
- If innodb_force_recovery is set to 6 then InnoDB does not do the
redo log roll-forward in connection with recovery. In this case,
log_sys will be initalized only and it will not have latest
checkpoint information. Checkpoint is done during shutdown even
innodb_force_recovery is set to 6. So it leads to incorrect
information update in checkpoint header.
Solution:
========
1) Allow drop table only if innodb_force_recovery < 5.
2) Make innodb as read-only if innodb_force_recovery is set to 6.
3) During shutdown, remove the checkpoint if innodb_force_recovery
is set to 6.
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
RB: 15075
Problem:
========
During checkpoint, we are writing all MLOG_FILE_NAME records in one mtr
and parse buffer can't be processed till MLOG_MULTI_REC_END. Eventually parse
buffer exceeds the RECV_PARSING_BUF_SIZE and eventually it overflows.
Fix:
===
1) Break the large mtr if it exceeds LOG_CHECKPOINT_FREE_PER_THREAD into multiple mtr during checkpoint.
2) Move the parsing buffer if we are encountering only MLOG_FILE_NAME
records. So that it will never exceed the RECV_PARSING_BUF_SIZE.
Reviewed-by: Debarun Bannerjee <debarun.bannerjee@oracle.com>
Reviewed-by: Rahul M Malik <rahul.m.malik@oracle.com>
RB: 14743
Description:
===========
Add my_thread_init() and my_thread_exit() for background threads which
initializes and frees the st_my_thread_var structure.
Reviewed-by: Jimmy Yang<jimmy.yang@oracle.com>
RB: 15003
Problem:
During read head, wrong page size is used to calcuate the tablespace size.
Fix:
Use physical page size to calculate tablespace size
Reveiwed-By: Satya Bodapati
RB: 14993
Split the test case so that a server restart is not needed.
Reduce the test cases and use a simpler mechanism for triggering
and waiting for purge.
fil_table_accessible(): Check if a table can be accessed without
enjoying MDL protection.
PROBLEM
When truncating single tablespace tables, we need to scan the entire
buffer pool to remove the pages of the table from the buffer pool.
During this scan and removal dict_sys->mutex is being held ,causing
stalls in other DDL operations.
FIX
Release the dict_sys->mutex during the scan and reacquire it after the
scan. Make sure that purge thread doesn't purge the records of the table
being truncated and background stats collection thread skips the updation
of stats for the table being truncated.
[#rb 14564 Approved by Jimmy and satya ]
srv_sys_t::n_threads_active[]: Protect writes by both the mutex and
by atomic memory access.
srv_active_wake_master_thread_low(): Reliably wake up the master
thread if there is work to do. The trick is to atomically read
srv_sys->n_threads_active[].
srv_wake_purge_thread_if_not_active(): Atomically read
srv_sys->n_threads_active[] (and trx_sys->rseg_history_len),
so that the purge should always be triggered when there is work to do.
trx_commit_in_memory(): Invoke srv_wake_purge_thread_if_not_active()
whenever a transaction is committed. Purge could have been prevented by
the read view of the currently committing transaction, even if it is
a read-only transaction.
trx_purge_add_update_undo_to_history(): Do not wake up the purge.
This is only called by trx_undo_update_cleanup(), as part of
trx_write_serialisation_history(), which in turn is only called by
trx_commit_low() which will always call trx_commit_in_memory().
Thus, the added call in trx_commit_in_memory() will cover also
this use case where a committing read-write transaction added
some update_undo log to the purge queue.
trx_rseg_mem_restore(): Atomically modify trx_sys->rseg_history_len.
Problem:
=======
Concurrent update dml statement doesn't reflect in virtual index during
inplace table rebuild. It results mismatch value in virutal index and
clustered index. Deleting the table content tries to search the mismatch
value in virtual index but it can't find the value. During log update
apply phase, virtual information is being ignored while constructing
the new entry.
Solution:
=========
In row_log_update_apply phase, build the entry with virtual column
information. So that it can reflect in newly constructed virtual index.
Reviewed-by: Jimmy Yang<jimmy.yang@oracle.com>
RB: 14974
Issue:
======
Disabling macros such as UNIV_PFS_MUTEX/UNIV_PFS_RWLOCK/UNIV_PFS_THREAD
which are defined in InnoDB throws errors during compilation.
Fix:
====
Fix all the compilation errors.
RB: 14893
Reviewed-by: Jimmy Yang <Jimmy.Yang@oracle.com>
Reviewed-by: Satya Bodapati <satya.bodapati@oracle.com>
Issue:
======
The issue is that if a fts index is present in a table the space size is
incorrectly calculated in the case of truncate which results in a invalid
read.
Fix:
====
Have a different space size calculation in truncate if fts indexes are
present.
RB:14755
Reviewed-by: Shaohua Wang <shaohua.wang@oracle.com>
Analysis:
========
There was missing bracket for IF conditon in dict_stats_analyze_index_level()
and it leads to wrong result.
Fix:
====
Fix the IF condition in dict_stats_analyze_index_level() so that it satisfied
the if condtion only if level is zero.
Reviewed-by : Jimmy Yang <jimmy.yang@oracle.com>
Analysis:
========
Field name comparison happens while filling the virtual columns
affected by foreign constraint. But field name is NULL in virtual
index for the newly added virtual column.
Fix:
===
Ignore the index if it has newly added virtual column. Foreign
key affected virtual column information is filled during
loading operation.
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
RB: 14895
Problem :
---------
Information_Schema.referential_constraints (UNIQUE_CONSTRAINT_NAME)
shows NULL for a foreign key constraint after restarting the server.
If any dml or query (select/insert/update/delete) is done on
referenced table, then the constraint name is correctly shown.
Solution :
----------
UNIQUE_CONSTRAINT_NAME column is the key name of the referenced table.
In innodb, FK reference is stored as a list of columns in referenced
table in INNODB_SYS_FOREIGN and INNODB_SYS_FOREIGN_COLS. The referenced
column must have at least one index/key with the referenced column as
prefix but the key name itself is not included in FK metadata. For this
reason, the UNIQUE_CONSTRAINT_NAME is only filled up when the
referenced table is actually loaded in innodb dictionary cache.
The information_schema view calls handler::get_foreign_key_list() on
foreign key table to read the FK metadata. The UNIQUE_CONSTRAINT_NAME
information shows NULL based on whether the referenced table is
already loaded or not.
One way to fix this issue is to load the referenced table while reading
the FK metadata information, if needed.
Reviewed-by: Sunny Bains <sunny.bains@oracle.com>
RB: 14654
Problem :
---------
This bug is filed from the base replication bug#25040331 where the
slave thread times out while INSERT operation waits on GAP lock taken
during Foreign Key validation.
The primary reason for the lock wait is because the statements are
getting replayed in different order. However, we also observed
two things ...
1. The slave thread could always use "Read Committed" isolation for
row level replication.
2. It is not necessary to have GAP locks in "READ Committed" isolation
level in innodb.
This bug is filed to address point(2) to avoid taking GAP locks during
Foreign Key validation.
Solution :
----------
Innodb is primarily designed for "Repeatable Read" and the GAP lock
behaviour is default. For "Read Committed" isolation, we have special
handling in row_search_mvcc to avoid taking the GAP lock while
scanning records.
While looking for Foreign Key, the code is following the default
behaviour taking GAP locks. The suggested fix is to avoid GAP
locking during FK validation similar to normal search operation
(row_search_mvcc) for "Read Committed" isolation level.
Reviewed-by: Sunny Bains <sunny.bains@oracle.com>
RB: 14526
Problem :
---------
1. delete_all_rows() and rnd_init() are not returning error
after async rollback in 5.7. This results in assert in
innodb in next call.
2. High priority transaction is rolling back prepared transaction.
This is because TRX_FORCE_ROLLBACK_DISABLE is getting set only for
first entry [TrxInInnoDB].
Solution :
----------
1. return DB_FORCED_ABORT error after rollback.
2. check and disable rollback in TrxInInnodb::enter always.
Reviewed-by: Sunny Bains <sunny.bains@oracle.com>
RB: 13777
Problem: Some instantiations of std::map have discrepancies between
the value_type of the map and the value_type of the map's allocator.
On FreeBSD 11 this is detected by Clang, and an error is raised at
compilation time.
Fix: Specify the correct value_type for the allocators.
Also fix an unused variable warning in storage/innobase/os/os0file.cc.
Prevent GCC from moving a mach_read_from_4() before we have checked that
we have 4 bytes to read. The pointer may only point to a 1, 2 or 3
bytes in which case the code should not read 4 bytes. This is a
workaround to a GCC bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77673
Patch submitted by: Laurynas Biveinis <laurynas.biveinis@gmail.com>
RB: 14135
Reviewed by: Pawel Olchawa <pawel.olchawa@oracle.com>
Description
===========
Under heavy load, the aysnchronous Windows file IO API can return a
failure code that is handled in MySQL Server by retrying the file IO operation.
A cast necessary for the correct operation of the retry path in a 64 bit
build is missing, leading to the file IO retry result being misinterpreted
and ultimately the report of the OS error number 995
(ERROR_OPERATION_ABORTED) in the MySQL error log.
Fix
===
Supply the missing cast.
Reviewed-by: Sunny Bains <sunny.bains@oracle.com>
RB: 14109
Avoid detaching and exiting from threads that may finish before the
caller has returned from pthread_create(). Only exit from such threads,
without detach and join with them later.
Patch submitted by: Laurynas Biveinis <laurynas.biveinis@gmail.com>
RB: 13983
Reviewed by: Sunny Bains <sunny.bains@oracle.com>
Simplify the tests that are present in MySQL 5.7. Make the table
smaller while generating enough undo log. Do not unnecessarily
drop tables.
trx_purge_initiate_truncate(): Remove two crash injection points
(before and after normal redo log checkpoint), because they are
not adding any value. Clarify some messages.
trx_sys_create_rsegs(): Display the number of active undo tablespaces.
srv_undo_tablespaces_init(): When initializing the data files, do not
leave srv_undo_tablespaces_active at 0.
Do not display that number; let trx_sys_create_rsegs() display it once
the final number is known.
innodb_params_adjust(): Adjust parameters after startup.
innobase_init(): Do not allow innodb_max_undo_size to be less
than SRV_UNDO_TABLESPACE_SIZE_IN_PAGES. This avoids unnecessary
repeated truncation of undo tablespaces when using
innodb_page_size=32k or innodb_page_size=64k.
Problem:
========
If we increase innodb_undo_logs value during startup. New rollback
segment might get allocated during trx_sys_create_rseg(). Essentially, it
would make these tablesapces active with transaction undo data and purge.
But it doesn't come in active undo tablespaces. so these tablespace would
never get truncated.
Fix:
===
Increase the number of active undo tablespace when we are assigning the
undo tablespace to the newly assigned rollback segment.
Reviewed-by: Kevin Lewis <kevin.lewis@oracle.com>
Reviewed-by: Debarun Banerjee <debarun.banerjee@oracle.com>
RB: 13746
MySQL 5.7 reduced the maximum number of innodb_undo_tablespaces
from 126 to 95 when it reserved 32 persistent rollback segments
for the temporary undo logs. Since MDEV-12289 restored all 128
persistent rollback segments for persistent undo logs, the
reasonable maximum value of innodb_undo_tablespaces is 127
(not 126 or 95). This is because out of the 128 rollback segments,
the first one will always be created in the system tablespace
and the remaining ones can be created in dedicated undo tablespaces.
Silence the Valgrind warnings on instrumented builds (-DWITH_VALGRIND).
assert_block_ahi_empty_on_init(): A variant of
assert_block_ahi_empty() that declares n_pointers initialized and then
asserts that n_pointers==0.
In Valgrind-instrumented builds, InnoDB declares allocated memory
uninitialized.
btr_search_drop_page_hash_index(): Do not return before ensuring
that block->index=NULL, even if !btr_search_enabled. We would
typically still skip acquiring the AHI latch when the AHI is
disabled, because block->index would already be NULL. Only if the AHI
is in the process of being disabled, we would wait for the AHI latch
and then notice that block->index=NULL and return.
The above bug was a regression caused in MySQL 5.7.9 by the fix of
Bug#21407023: DISABLING AHI SHOULD AVOID TAKING AHI LATCH
The rest of this patch improves diagnostics by adding assertions.
assert_block_ahi_valid(): A debug predicate for checking that
block->n_pointers!=0 implies block->index!=NULL.
assert_block_ahi_empty(): A debug predicate for checking that
block->n_pointers==0.
buf_block_init(): Instead of assigning block->n_pointers=0,
assert_block_ahi_empty(block).
buf_pool_clear_hash_index(): Clarify comments, and assign
block->n_pointers=0 before assigning block->index=NULL.
The wrong ordering could make block->n_pointers appear incorrect in
debug assertions. This bug was introduced in MySQL 5.1.52 by
Bug#13006367 62487: INNODB TAKES 3 MINUTES TO CLEAN UP THE
ADAPTIVE HASH INDEX AT SHUTDOWN
i_s_innodb_buffer_page_get_info(): Add a comment that
the IS_HASHED column in the INFORMATION_SCHEMA views
INNODB_BUFFER_POOL_PAGE and INNODB_BUFFER_PAGE_LRU may
show false positives (there may be no pointers after all.)
ha_insert_for_fold_func(), ha_delete_hash_node(),
ha_search_and_update_if_found_func(): Use atomics for
updating buf_block_t::n_pointers. While buf_block_t::index is
always protected by btr_search_x_lock(index), in
ha_insert_for_fold_func() the n_pointers-- may belong to
another dict_index_t whose btr_search_latches[] we are not holding.
RB: 13879
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
Analysis:
========
A foreign key constraint cannot reference a secondary index defined
on a generated virtual column. While adding new index/drop existing
column, server internally drops the internal foreign key index and
it leads to choose the virtual secondary index as foreign key index.
But innodb doesn't allow foreign key constraint reference to
secondary virtual index.
Fix:
===
Allow foreign key constraint refer to secondary index defined on
a generated virutal column.
Reviewed-by: Jimmy Yang<jimmy.yang@oracle.com>
RB: 13586
This bug was introduced in MySQL 5.6.8 with WL#6255.
When an error occurs while rebuilding a table that only has a
hidden GEN_CLUST_INDEX inside InnoDB, ha_alter_info->key_info_buffer
would be invalid and should not be dereferenced.
get_error_key_name(): Get the name of an erroneous key.
Avoid dereferencing ha_alter_info->key_info_buffer when no keys
exist in the SQL layer.
ha_innobase::inplace_alter_table(),
ha_innobase::commit_try_rebuild(): Invoke get_error_key_name()
for reporting ER_INNODB_ONLINE_LOG_TOO_BIG or ER_INDEX_CORRUPT.
RB: 13834
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
PROBLEM
By design stats estimation always reading uncommitted data. In this scenario
an uncommitted transaction has deleted all rows in the table. In Innodb
uncommitted delete records are marked as delete but not actually removed
from Btree until the transaction has committed or a read view for the rows
is present.While calculating persistent stats we were ignoring the delete
marked records,since all the records are delete marked we were estimating
the number of rows present in the table as zero which leads to bad plans
in other transaction operating on the table.
Fix
Introduced a system variable called innodb_stats_include_delete_marked
which when enabled includes delete marked records for stat
calculations .
Apparently, WL#8149 QA did not cover the code changes made to
online table rebuild (which was introduced in MySQL 5.6.8 by WL#6255)
for ROW_FORMAT=REDUNDANT tables.
row_log_table_low_redundant(): Log the new values of indexed virtual
columns (ventry) only once.
row_log_table_low(): Assert that if o_ventry is specified, the
logged operation must not be ROW_T_INSERT, and ventry must be specified
as well.
row_log_table_low(): When computing the size of old_pk, pass v_entry=NULL to
rec_get_converted_size_temp(), to be consistent with the subsequent call
to rec_convert_dtuple_to_temp() for logging old_pk. Assert that
old_pk never contains information on virtual columns, thus proving that this
change is a no-op.
RB: 13822
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
Problem:
=======
Autoincrement value gives duplicate values because of the following reasons.
(1) In InnoDB handler function, current autoincrement value is not changed
based on newly set auto_increment_increment or auto_increment_offset variable.
(2) Handler function does the rounding logic and changes the current
autoincrement value and InnoDB doesn't aware of the change in current
autoincrement value.
Solution:
========
Fix the problem(1), InnoDB always respect the auto_increment_increment
and auto_increment_offset value in case of current autoincrement value.
By fixing the problem (2), handler layer won't change any current
autoincrement value.
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
RB: 13748
When a table has no PRIMARY KEY, but there is a UNIQUE INDEX
defined on NOT NULL columns that are not column prefixes,
that unique index must be treated as the primary key.
This property was being violated by InnoDB when a column was changed
to NOT NULL, such that a UNIQUE INDEX on that column became eligible
to being treated as a primary key.
innobase_create_key_defs(): Instead of checking each ADD [UNIQUE] INDEX
request, check if a GEN_CLUST_INDEX can be replaced with any unique index
in the altered_table definition. So, we can have new_primary even
if n_add==0.
prepare_inplace_alter_table_dict(): When the table is not being rebuilt,
assert that TABLE_SHARE::primary_key is not changing.
RB: 13595
Reviewed-by: Kevin Lewis <kevin.lewis@oracle.com>
Problem:
=======
High priority transaction can't able to kill the blocking transaction
when foreign keys are involved. trx_kill_blocking() missing while checking
the foreign key constraint.
Fix:
===
Add trx_kill_blocking() while checking for the foreign key constraint.
Reviewed-by: Debarun Banerjee <debarun.banerjee@oracle.com>
RB: 13579
Analysis: In row_log_table_delete(), extern size could be greater
than 2 bytes int if there are enough index on blob columns.
Solution: Use 4 bytes int other than 2 bytes for extern size.
Reviewed-by: Marko Mäkelä <marko.makela@oracle.com>
RB: 13573
BUG#23742339 FAILING ASSERTION: SYM_NODE->TABLE != NULL
Analysis: When we access fts aux tables in information schema,the
fts aux tables are dropped by DROP DATABASE in another session.
Solution: Drop parent table if it's a fts aux table, and drop
table will drop fts aux tables together.
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
RB: 13264
Analysis:
the old table is dropped, just after it's added into drop list,
and new table with the same name is created, then we try to drop
the new table in background.
Solution:
Don't drop a table in background if table->to_be_dropped is false.
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
RB: 13414
Analysis:
When we access fts_internal_tbl_name in i_s_fts_config_fill (),
it can be set to NULL by another session.
Solution:
Define fts_internal_tbl_name2 for global variable innodb_ft_aux_table,
if it's NULL, set fts_internal_tbl_name to "default".
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
RB: 13401
Problem:
=======
Inplace alter algorithm determines the table to be rebuild if the table
undergoes row format change, key block size if handler flag contains only
change table create option. If alter with inplace ignore flag operations and change table create options then it leads to table rebuild operation.
Solution:
========
During the check for rebuild, ignore the inplace ignore flag and check for
table create options.
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
Reviewed-by: Marko Makela <marko.makela@oracle.com>
RB: 13172
buf_flush_write_block_low(): Acquire the tablespace reference once,
and pass it to lower-level functions. This is only a start; further
calls may be removed.
fil_decompress_page(): Remove unsafe use of fil_space_get_by_id().
fil_crypt_thread(): Do invoke fil_crypt_complete_rotate_space()
when the tablespace is about to be dropped. Also, remove a redundant
check whether rotate_thread_t::space is NULL. It can only become
NULL when fil_crypt_find_space_to_rotate() returns false, and in
that case we would already have terminated the loop.
fil_crypt_find_page_to_rotate(): Remove a redundant check for
space->crypt_data == NULL. Once encryption metadata has been
created for a tablespace, it cannot be removed without dropping
the entire tablespace.