buf_LRU_drop_page_hash_for_tablespace(): Return whether any adaptive
hash index entries existed. If yes, the caller should keep retrying to
drop the adaptive hash index.
row_import_for_mysql(), row_truncate_table_for_mysql(),
row_drop_table_for_mysql(): Ensure that the adaptive hash index was
entirely dropped for the table.
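A minimal sketch of that retry contract, with a stubbed stand-in for the
real buffer-pool scan (the real function takes the tablespace and walks
the buffer pool LRU list; the names below are placeholders):

  // Stand-in for the real scan (stubbed so the sketch is self-contained):
  // returns true if any adaptive hash index entries existed for the
  // tablespace during this pass.
  static bool drop_page_hash_for_tablespace(unsigned /*space_id*/)
  {
    return false;  /* stub; the real code scans the buffer pool LRU */
  }

  // Caller-side pattern in row_import_for_mysql() etc.: keep retrying
  // until a full pass finds no entries, because new entries could have
  // been created while the previous pass ran.
  static void drop_ahi_until_empty(unsigned space_id)
  {
    while (drop_page_hash_for_tablespace(space_id)) {
      /* another full pass is needed */
    }
  }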
Merge the following change from 10.2:
revision-id: d52cff9f10aeea208a1058f7b5527e602125584c (mariadb-10.2.14-25-gd52cff9)
parent(s): bc2501453c
author: Sachin Setiya
committer: Sachin Setiya
timestamp: 2018-04-04 12:26:06 +0530
message:
MDEV-15611 Due to the failure of foreign key detection, Galera
slave node killed itself.
Problem:- If we try to delete a table with a foreign key together with
the table it refers to, with wsrep_slave_threads>1, Galera tries to
execute both Delete_rows_log_event transactions in parallel, which should
not happen.
Solution:- This happens because we do not have the foreign key info in
the write set. Up to version 10.2.7 it used to work fine; the regression
was introduced in commit 2f342c4: wsrep_must_process_fk should be used
with negation, as sketched below.
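A hedged sketch of the inverted check; the predicate and call site below
are placeholders, not the actual wsrep signatures:

  static bool wsrep_must_process_fk_stub() { return true; }  /* placeholder */

  static void append_fk_keys_sketch()
  {
    /* Before (the regression in commit 2f342c4), the predicate was used
    without negation, skipping FK processing exactly when it was
    required. After the fix: */
    if (!wsrep_must_process_fk_stub()) {
      return;  /* skip only when FK processing is not required */
    }
    /* ... append foreign key references to the write set ... */
  }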
fil_page_decompress(): Replaces fil_decompress_page(). Allow the caller
to detect errors. Remove duplicated code. Use the "safe" instead of the
"fast" variants of the decompression routines.
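A sketch of the error-reporting shape, assuming a hypothetical signature;
the point is the 0-on-failure return and preferring a bounds-checked
"safe" codec call:

  #include <cstddef>
  #include <cstdint>

  // Returns the decompressed length, or 0 so the caller can treat the
  // page as corrupted instead of using garbage.
  size_t page_decompress_sketch(uint8_t* tmp_buf, uint8_t* page)
  {
    /* Call the "safe" codec variant here, e.g. LZ4_decompress_safe(),
    which bounds-checks its output, instead of the "fast" variant,
    which trusts the length stored in the (possibly corrupted) page. */
    (void) tmp_buf;
    (void) page;
    return 0;  /* 0 = corruption detected; nonzero = decompressed size */
  }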
fil_page_compress(): Replaces fil_compress_page(). The length of the
input buffer is always srv_page_size (innodb_page_size). Remove printouts,
and remove the fil_space_t* parameter.
buf_tmp_buffer_t::reserved: Make private; the accessors acquire()
and release() will use atomic memory access.
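The accessor idea, sketched with std::atomic; the real member type and
memory ordering in InnoDB may differ:

  #include <atomic>

  class buf_tmp_buffer_sketch {
    std::atomic<bool> reserved{false};
  public:
    /* Grab the slot if it was free; true on success. The atomic
    exchange makes test-and-set one indivisible step, so no mutex
    is needed. */
    bool acquire() { return !reserved.exchange(true, std::memory_order_relaxed); }
    /* Return the slot to the pool. */
    void release() { reserved.store(false, std::memory_order_relaxed); }
  };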
buf_pool_reserve_tmp_slot(): Make static. Remove the second parameter.
Do not acquire any mutex. Remove the allocation of the buffers.
buf_tmp_reserve_crypt_buf(), buf_tmp_reserve_compression_buf():
Refactored away from buf_pool_reserve_tmp_slot().
buf_page_decrypt_after_read(): Make static, and simplify the logic.
Use the encryption buffer also for decompressing.
buf_page_io_complete(), buf_dblwr_process(): Check more failures.
fil_space_encrypt(): Simplify the debug checks.
fil_space_t::printed_compression_failure: Remove.
fil_get_compression_alg_name(): Remove.
fil_iterate(): Allocate a buffer for compression and decompression
only once, instead of allocating and freeing it for every page
that uses compression, during IMPORT TABLESPACE.
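The buffer-lifetime change, sketched with standard containers; the real
code uses aligned buffers sized relative to srv_page_size, and the
function name below is illustrative:

  #include <cstddef>
  #include <cstdint>
  #include <vector>

  void iterate_pages_sketch(size_t n_pages, size_t page_size)
  {
    // Allocated once for the whole IMPORT, not per compressed page.
    std::vector<uint8_t> codec_buf(2 * page_size);
    for (size_t i = 0; i < n_pages; i++) {
      /* ... decompress/decrypt page i via codec_buf.data() ... */
    }
  }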
fil_node_get_space_id(), fil_page_is_index_page(),
fil_page_is_lzo_compressed(): Remove (unused code).
When attempting to rename a table to a non-existing database,
InnoDB would misleadingly report "OS error 71" when in fact the
error code is InnoDB's own (OS_FILE_NOT_FOUND), and not report
both pathnames. Errors on rename could occur due to reasons
connected to either pathname.
os_file_handle_rename_error(): New function, to report errors in
renaming files.
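A minimal sketch of the reporting idea (hypothetical helper; the real
function also maps the native error to an InnoDB error code):

  #include <cerrno>
  #include <cstdio>
  #include <cstring>

  static void report_rename_error(const char* from, const char* to)
  {
    // Report both pathnames: a rename can fail because of either one.
    fprintf(stderr, "InnoDB: Cannot rename file '%s' to '%s': %s\n",
            from, to, strerror(errno));
  }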
fil_iterate(): Invoke fil_encrypt_buf() correctly when
a ROW_FORMAT=COMPRESSED table with a physical page size of
innodb_page_size is being imported. Also, validate the page checksum
before decryption, and reduce the scope of some variables.
AbstractCallback::operator()(): Remove the parameter 'offset'.
The check for it in FetchIndexRootPages::operator() was basically
redundant and dead code since the previous refactoring.
InnoDB insisted on closing the file handle before renaming a file.
Renaming a file should never be a problem on POSIX systems. Also on
Windows it should work if the file was opened in FILE_SHARE_DELETE
mode.
fil_space_t::stop_ios: Remove. We no longer need to stop file access
during rename operations.
fil_mutex_enter_and_prepare_for_io(): Remove the wait for stop_ios.
fil_rename_tablespace(): Remove the retry logic; do not close the
file handle. Remove the unused fault injection that was added along
with the DATA DIRECTORY functionality (MySQL WL#5980).
os_file_create_simple_func(), os_file_create_func(),
os_file_create_simple_no_error_handling_func(): Include FILE_SHARE_DELETE
in the share_mode. (We will still prevent multiple InnoDB instances
from using the same files by not setting FILE_SHARE_WRITE.)
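A Windows-only sketch of the share-mode change; the flags are simplified
and the real calls also handle async I/O and error reporting:

  #ifdef _WIN32
  #include <windows.h>

  HANDLE open_data_file_sketch(const wchar_t* name)
  {
    return CreateFileW(name, GENERIC_READ | GENERIC_WRITE,
                       // FILE_SHARE_DELETE allows renaming the open file;
                       // omitting FILE_SHARE_WRITE still keeps a second
                       // InnoDB instance from writing to the same file.
                       FILE_SHARE_READ | FILE_SHARE_DELETE,
                       nullptr, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL,
                       nullptr);
  }
  #endif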
ha_innobase::optimize(): If both innodb_defragment and
innodb_optimize_fulltext_only are at their default settings (OFF),
fall back to ALTER TABLE. Else process one or both options.
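The dispatch, sketched with an illustrative status value standing in for
the handler API's HA_ADMIN_TRY_ALTER:

  enum { HA_ADMIN_OK_SKETCH = 0, HA_ADMIN_TRY_ALTER_SKETCH = -1 };

  int optimize_sketch(bool innodb_defragment, bool optimize_fulltext_only)
  {
    if (!innodb_defragment && !optimize_fulltext_only) {
      // Both options at their default (OFF): let the server rebuild
      // the table, i.e. map OPTIMIZE TABLE to ALTER TABLE ... FORCE.
      return HA_ADMIN_TRY_ALTER_SKETCH;
    }
    /* ... process defragmentation and/or FTS-only optimization ... */
    return HA_ADMIN_OK_SKETCH;
  }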
ha_innobase::set_partition_owner_stats(): Remove (unused function).
ha_innobase::ha_partition_stats: Remove (the variable is never read).
Remove unused ut_timer functions.
Also fixes MDEV-14727 and MDEV-14491
("InnoDB: Error: Waited for 5 secs for hash index ref_count (1) to drop to 0")
by replacing the flawed wait logic in dict_index_remove_from_cache_low().
On DISCARD TABLESPACE, there is no need to drop the adaptive hash index.
We must drop it on IMPORT TABLESPACE, and eventually on DROP TABLE or
DROP INDEX. As long as the dict_index_t object remains in the cache
and the table remains inaccessible, the adaptive hash index entries
pointing to orphaned pages would not do any harm. They would be dropped when
buffer pool pages are reused for something else.
btr_search_drop_page_hash_when_freed(), buf_LRU_drop_page_hash_batch():
Remove the parameter zip_size, and pass 0 to buf_page_get_gen().
buf_page_get_gen(): Ignore zip_size if mode==BUF_PEEK_IF_IN_POOL.
buf_LRU_drop_page_hash_for_tablespace(): Drop the adaptive hash index
even if the tablespace is inaccessible.
buf_LRU_drop_page_hash_for_tablespace(): New global function, to drop
the adaptive hash index.
buf_LRU_flush_or_remove_pages(), fil_delete_tablespace():
Remove the parameter drop_ahi.
dict_index_remove_from_cache_low(): Actively drop the adaptive hash index
if entries exist. This should prevent InnoDB hangs on DROP TABLE or
DROP INDEX.
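The replaced wait logic, sketched with stubbed stand-ins for the adaptive
hash index bookkeeping (all names below are placeholders):

  static unsigned g_ref_count = 0;  /* stub for btr_search ref_count */
  static unsigned index_ahi_ref_count() { return g_ref_count; }
  static void drop_ahi_pages_for_index() { /* evict AHI entries */ }

  static void remove_index_from_cache_sketch()
  {
    // Instead of sleeping up to 5 seconds and hoping ref_count drops,
    // actively evict the remaining adaptive hash index entries.
    while (index_ahi_ref_count() > 0) {
      drop_ahi_pages_for_index();
    }
    /* ... now it is safe to free the dict_index_t object ... */
  }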
row_import_for_mysql(): Drop any adaptive hash index entries for the table.
row_drop_table_for_mysql(): Drop any adaptive hash index for the table,
except if the table resides in the system tablespace. (DISCARD TABLESPACE
does not apply to the system tablespace, and we do not want to drop the
adaptive hash index for other tables than the one that is being dropped.)
row_truncate_table_for_mysql(): Drop any adaptive hash index entries for
the table, except if the table resides in the system tablespace.
When the transaction isolation level is SERIALIZABLE, or when
a locking read is performed in the REPEATABLE READ isolation level,
InnoDB must lock delete-marked records in order to prevent another
transaction from inserting something.
However, at READ UNCOMMITTED or READ COMMITTED isolation level or
when the parameter innodb_locks_unsafe_for_binlog is set, the
repeatability of the reads does not matter, and there is no need
to lock any records.
row_search_for_mysql(): Skip locks on delete-marked committed records
upfront, instead of invoking row_unlock_for_mysql() afterwards.
The unlocking never worked for secondary index records.
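The same decision restated as a predicate, with illustrative types:

  enum iso_level_sketch { READ_UNCOMMITTED, READ_COMMITTED,
                          REPEATABLE_READ, SERIALIZABLE };

  // May a locking read skip the lock on a committed, delete-marked
  // record? Mirrors the reasoning above: repeatability only matters
  // at REPEATABLE READ and above, unless the unsafe setting is on.
  bool may_skip_delete_marked_lock(iso_level_sketch iso,
                                   bool locks_unsafe_for_binlog)
  {
    return iso <= READ_COMMITTED || locks_unsafe_for_binlog;
  }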
Problem:
When an FTS index is added to a table which does not have an 'FTS_DOC_ID'
column, InnoDB rebuilds the table to add the 'FTS_DOC_ID' column. When
this FTS index is later dropped, InnoDB does not rebuild the table to
remove the 'FTS_DOC_ID' column; it deletes the FTS index auxiliary tables,
but it does not delete the FTS common auxiliary tables.
Later, when the database containing this table is renamed, the FTS
auxiliary tables are not renamed, because the DICT_TF2_FTS flag in the
table's flags2 (dict_table_t::flags2) was reset during the FTS index drop
operation. When we then drop the old database, this leads to an assertion
failure.
Fix:
During the renaming of FTS auxiliary tables, OR in a condition that also
checks whether the table has the DICT_TF2_FTS_HAS_DOC_ID flag set, as
sketched below.
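The added condition, sketched with illustrative bit values; the flag
names echo dict0mem.h, but the values and helper below are made up:

  #include <cstdint>

  constexpr uint32_t TF2_FTS_SKETCH            = 1U << 2;  /* illustrative */
  constexpr uint32_t TF2_FTS_HAS_DOC_ID_SKETCH = 1U << 3;  /* illustrative */

  bool must_rename_fts_aux_tables(uint32_t flags2)
  {
    /* DICT_TF2_FTS is reset when the last FTS index is dropped, but the
    common auxiliary tables live on as long as FTS_DOC_ID exists, so the
    HAS_DOC_ID flag is ORed into the rename condition. */
    return (flags2 & (TF2_FTS_SKETCH | TF2_FTS_HAS_DOC_ID_SKETCH)) != 0;
  }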
RB: 18769
Reviewed by : Jimmy.Yang@oracle.com
Problem:
=======
Multiple insert statements into a table that contains a FULLTEXT KEY and
an FTS_DOC_ID column abort the server if the FTS_DOC_ID exceeds
FTS_DOC_ID_MAX_STEP.
Solution:
========
Remove the exception for the first committed insert statement.
Reviewed-by: Jimmy Yang<jimmy.yang@oracle.com>
RB: 18023
When Oracle fixed MDEV-13899 in their own way, they moved the
condition to the only caller of PageConverter::update_records().
Thus, the merge of 5.6.40 into MariaDB added a redundant condition.
PageConverter::update_records(): Move the page_is_leaf() condition
to the only caller, PageConverter::update_index_page().
The problem is hard to repeat, and I failed to create a deterministic
test case. Online index creation creates stubs for to-be-created indexes.
If index creation fails, we could remove these stubs while locks exist
in the indexes. (This would require that the index creation was completed,
and a concurrent DML operation acquired a lock on a record in the
uncommitted index. If a duplicate key error occurs in an uncommitted
index, the error will be reported for the CREATE UNIQUE INDEX, not for
the DML operation that tried to insert the duplicate.)
dict_table_try_drop_aborted(), row_merge_drop_indexes(): If transactional
locks exist on the table, keep the table->indexes intact.
ha_innobase::commit_inplace_alter_table(): Defer the freeing of ctx->trx
until after the operation has been successfully committed. In this way,
rollback on a partitioned table will be possible.
rollback_inplace_alter_table(): Handle ctx->new_table == NULL when
ctx->trx != NULL.
If the tablespace is dropped or truncated after the
space->is_stopping() check in fil_crypt_get_page_throttle_func(),
we would proceed to request the page, and eventually report a fatal
error.
buf_page_get_gen(): Do not retry reading if mode==BUF_GET_POSSIBLY_FREED.
lock_rec_block_validate(): Be prepared for a NULL return value when
invoking buf_page_get_gen() with mode=BUF_GET_POSSIBLY_FREED.
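The caller-side pattern, with simplified stand-ins for buf_page_get_gen()
and the block type (names below are placeholders):

  struct buf_block_sketch;

  /* Stand-in for buf_page_get_gen(..., BUF_GET_POSSIBLY_FREED, ...):
  may return NULL when the tablespace was dropped or truncated. */
  static buf_block_sketch* get_possibly_freed(unsigned, unsigned)
  {
    return nullptr;  /* stub */
  }

  static void lock_rec_block_validate_sketch(unsigned space_id,
                                             unsigned page_no)
  {
    if (buf_block_sketch* block = get_possibly_freed(space_id, page_no)) {
      /* ... validate the record locks on this block ... */
      (void) block;
    }
    /* else: nothing to validate; the page is legitimately gone */
  }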
Issue:
------
Prefixes of externally stored columns were being written to the online_log
when a table was altered in a way that caused the table to be rebuilt.
Space in the online_log is limited, and if the prefix of an externally
stored column was very long, it was written to the online log without
checking whether it fits. This leads to memory corruption.
Fix:
----
After the fix for Bug#16544143, there is no need to store prefixes of
externally stored columns in the online_log. Thus, remove the code which
stores column prefixes for externally stored columns. Also, before writing
anything to the online_log, make sure it fits in the available memory, to
avoid memory corruption; see the sketch below.
Read the RB page for more details.
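A sketch of the check-before-writing rule; the function name and failure
handling are illustrative, not the actual row0log.cc code:

  #include <cstddef>
  #include <cstdint>
  #include <cstring>

  // Returns false (think DB_ONLINE_LOG_TOO_BIG) instead of writing
  // past the end of the fixed-size log buffer.
  bool online_log_append_sketch(uint8_t* buf, size_t capacity,
                                size_t* used,
                                const uint8_t* rec, size_t rec_len)
  {
    if (rec_len > capacity - *used) {
      return false;  /* would overflow the online_log */
    }
    memcpy(buf + *used, rec, rec_len);
    *used += rec_len;
    return true;
  }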
Reviewed-by: Annamalai Gurusami <annamalai.gurusami@oracle.com>
RB: 18239
In commit e7f4e61f6e, the call fil_flush_file_spaces(FIL_LOG) is
necessary: tablespaces will be flushed as part of the redo log
checkpoint, but the redo log itself will not necessarily be flushed,
depending on innodb_flush_method.
These tests can sporadically show mutex deadlock warnings between the
LOCK_wsrep_thd and LOCK_thd_data mutexes. This means that these mutexes
can be locked in opposite order by different threads, and thus result in
a deadlock.
To fix such an issue, the locking policy of these mutexes should be
revised and enforced to be uniform. However, a quick code review shows
that the number of lock/unlock operations for these mutexes combined is
between 100 and 200, and all of these mutex invocations would have to be
checked/fixed.
On the other hand, it turns out that LOCK_wsrep_thd is used for protecting
access to the wsrep variables of THD (wsrep_conflict_state,
wsrep_query_state), whereas LOCK_thd_data protects the query, db, and
mysys_var variables in THD. Extending LOCK_thd_data to also protect the
wsrep variables looks like a viable solution, as there should not be a use
case where separate threads need simultaneous access to the wsrep
variables and the THD data variables.
In this commit, the LOCK_wsrep_thd mutex is refactored away and replaced
by LOCK_thd_data. Bluntly replacing LOCK_wsrep_thd with LOCK_thd_data
would result in double locking of LOCK_thd_data, so some adjustments have
been made to fix such situations; the usual pattern is sketched below.
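The double-locking hazard and the usual adjustment, sketched with
std::mutex and hypothetical helper names:

  #include <mutex>

  static std::mutex LOCK_thd_data_sketch;

  // Adjustment pattern: a *_locked variant for callers that already
  // hold LOCK_thd_data, so the blunt mutex replacement cannot lead to
  // self-deadlock.
  static void set_wsrep_state_locked() { /* caller holds the mutex */ }

  static void set_wsrep_state()
  {
    std::lock_guard<std::mutex> guard(LOCK_thd_data_sketch);
    set_wsrep_state_locked();
  }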
dict_load_table_low(): When flagging an error, assign *table = NULL.
Failure to do so could cause a crash if an error was flagged when
accessing INFORMATION_SCHEMA.INNODB_SYS_TABLES.
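The contract after the fix, sketched with simplified types; the real
function reports an error and fills in a dict_table_t** out-parameter:

  struct dict_table_sketch;

  // Returns an error message, or nullptr on success.
  const char* load_table_sketch(bool rec_corrupted,
                                dict_table_sketch** table)
  {
    if (rec_corrupted) {
      *table = nullptr;  /* the fix: never leave *table dangling on error */
      return "incorrect flags in SYS_TABLES";
    }
    /* ... otherwise allocate and populate *table ... */
    *table = nullptr;  /* placeholder for the real allocation */
    return nullptr;
  }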