Commit graph

1350 commits

Author SHA1 Message Date
Aleksey Midenkov
2347ffd843 MDEV-20301 InnoDB's MVCC has O(N^2) behaviors
If there're multiple row versions in InnoDB, reading one row from PK
may have O(N) complexity and reading from secondary keys may have
O(N^2) complexity.

The problem occurs when there are many pending versions of the same
row, meaning that the primary key is the same, but a secondary key is
different.  The slowdown occurs when the secondary index is
traversed. This patch creates a helper class for the function
row_sel_get_clust_rec_for_mysql() which can remember and re-use
cached_clust_rec & cached_old_vers so that rec_get_offsets() does not
need to be called over and over for the clustered record.

Corrections by Kevin Lewis <kevin.lewis@oracle.com>

MDEV-20341 Unstable innodb.innodb_bug14704286

Removed test that tested the ability of interrupting long query which
is not long anymore.
2019-08-14 19:10:17 +03:00
Marko Mäkelä
609ea2f37b MDEV-17614: After-merge fix
MDEV-17614 flags INSERT…ON DUPLICATE KEY UPDATE unsafe for statement-based
replication when there are multiple unique indexes. This correctly fixes
something whose attempted fix in MySQL 5.7
in mysql/mysql-server@c93b0d9a97
caused lock conflicts. That change was reverted in MySQL 5.7.26
in mysql/mysql-server@066b6fdd43
(with a substantial amount of other changes).

In MDEV-17073 we already disabled the unfortunate MySQL change when
statement-based replication was not being used. Now, thanks to MDEV-17614,
we can actually remove the change altogether.

This reverts commit 8a346f31b9 (MDEV-17073)
and mysql/mysql-server@c93b0d9a97 while
keeping the test cases.
2019-08-12 18:50:54 +03:00
Marko Mäkelä
a7e9395f9d fts_sync_table(), fts_sync() dead code removal
fts_sync(): Remove the constant parameter has_dict=false.

fts_sync_table(): Remove the constant parameter has_dict=false,
and the redundant parameter unlock_cache = !wait.
Make wait=true the default parameter.
2019-07-25 13:34:36 +03:00
Thirunarayanan Balathandayuthapani
8fb39b2c35 MDEV-19870 gcol.innodb_virtual_debug_purge doesn't fail if row_vers_old_has_index_entry gives wrong result
1) Whenever purge thread tries to remove the secondary virtual index
entry, purge thread acquires metadata lock for the table and release
dict_operation_lock. After that, it retries the secondary index
deletion if MDL acquired successfully.

2) Inside row_vers_old_has_index_entry(), Change the safe_to_purge
to unsafe_to_purge goto statement. So it can be more appropriate to
return true if it is unsafe_to_purge.

3) Previously, row_vers_old_has_index_entry() returns false if InnoDB
fetched the MDL on the table for the first time. This check(two cases)
should checked only during purge thread. In row_purge_poss_sec(), again
InnoDB checks whether the MDL fetched for the first time. If it is then
InnoDB retry the secondary index deletion logic. So in that case,
InnoDB have to clean up the memory used inside row_vers_old_has_index_entry()
and shouldn't care about return value.
2019-07-24 16:45:05 +05:30
Marko Mäkelä
60c790d6f4 Merge 10.1 into 10.2 2019-07-22 15:28:05 +03:00
Marko Mäkelä
a5e268a293 MDEV-20102 Phantom InnoDB table remains after interrupted CREATE...SELECT
This is a regression due to MDEV-16515 that affects some versions in
the MariaDB 10.1 server series starting with 10.1.35, and possibly
all versions starting with 10.2.17, 10.3.8, and 10.4.0.

The idea of MDEV-16515 is to allow DROP TABLE to be interrupted,
in case it was stuck due to some concurrent activity. We already
made some cases of internal DROP TABLE immune to kill in MDEV-18237,
MDEV-16647, MDEV-17470. We must include the cleanup of
CREATE TABLE...SELECT in the list of such internal DROP TABLE.

ha_innobase::delete_table(): Pass create_failed=true if the current
SQL statement is CREATE, so that the table will be dropped.

row_drop_table_for_mysql(): If create_failed=true, do not allow
the operation to be interrupted.
2019-07-22 14:55:46 +03:00
Eugene Kosov
9c29d06862 MDEV-20097 potential use-after-free
row_merge_read_clustered_index(): fix one more place with buf and merge_buf[i]
2019-07-19 11:42:08 +03:00
Marko Mäkelä
6a55aeb2af Merge 10.1 into 10.2 2019-07-18 23:38:48 +03:00
Eugene Kosov
5b205458d9 MDEV-20097 potential use-after-free
row_merge_read_clustered_index(): make buf always equals to merge_buf[i]
2019-07-18 22:28:11 +03:00
Eugene Kosov
26c389b7b7 Merge 10.1 into 10.2 2019-07-09 13:22:22 +03:00
Jan Lindström
b105427745 MDEV-19660: wsrep_rec_get_foreign_key() is dereferencing a stale pointer to a page that was previously latched
In row_ins_foreign_check_on_constraint(), clustered index record is being passed to wsrep_append_foreign_key() after releasing the latch. If a record has been changed by other thread in the meantime then it could lead to a crash when
wsrep_rec_get_foreign_key () tries to access the record.

row_ins_foreign_check_on_constraint
	Use cascade->pcur->old_rec instead of clust_rec.

row_ins_check_foreign_constraint
	Add missing error printout.
2019-07-02 10:06:13 +03:00
Thirunarayanan Balathandayuthapani
723a4b1d78 MDEV-17228 Encrypted temporary tables are not encrypted
- Introduce a new variable called innodb_encrypt_temporary_tables which is
a boolean variable. It decides whether to encrypt the temporary tablespace.
- Encrypts the temporary tablespace based on full checksum format.
- Introduced a new counter to track encrypted and decrypted temporary
tablespace pages.
- Warnings issued if temporary table creation has conflict value with
innodb_encrypt_temporary_tables
- Added a new test case which reads and writes the pages from/to temporary
tablespace.
2019-06-28 19:07:59 +05:30
Thirunarayanan Balathandayuthapani
e9145aab44 MDEV-19435 buf_fix_count > 0 for corrupted page when it exits the LRU list
Problem:
=========
One of the purge thread access the corrupted page and tries to remove from
LRU list. In the mean time, other purge threads are waiting for same page
in buf_wait_for_read(). Assertion(buf_fix_count == 0) fails for the
purge thread which tries to remove the page from LRU list.

Solution:
========
- Set the page id as FIL_NULL to indicate the page is corrupted before
removing the block from LRU list. Acquire hash lock for the particular
page id and wait for the other threads to release buf_fix_count
for the block.

- Added the error check for btr_cur_open() in row_search_on_row_ref().
2019-06-13 16:13:51 +03:00
Marko Mäkelä
cbac8f9351 MDEV-19725 Incorrect error handling in ALTER TABLE
Some I/O functions and macros that are declared in os0file.h used to
return a Boolean status code (nonzero on success). In MySQL 5.7, they
were changed to return dberr_t instead. Alas, in MariaDB Server 10.2,
some uses of functions were not adjusted to the changed return value.

Until MDEV-19231, the valid values of dberr_t were always nonzero.
This means that some code that was incorrectly checking for a zero
return value from the functions would never detect a failure.

After MDEV-19231, some tests for ALTER ONLINE TABLE would fail with
cmake -DPLUGIN_PERFSCHEMA=NO. It turned out that the wrappers
pfs_os_file_read_no_error_handling_int_fd_func() and
pfs_os_file_write_int_fd_func() were wrongly returning
bool instead of dberr_t. Also the callers of these functions were
wrongly expecting bool (nonzero on success) instead of dberr_t.

This mistake had been made when the addition of these functions was
merged from MySQL 5.6.36 and 5.7.18 into MariaDB Server 10.2.7.

This fix also reverts commit 40becbc3c7
which attempted to work around the problem.
2019-06-10 18:15:25 +03:00
Thirunarayanan Balathandayuthapani
bb5d04c9b8 MDEV-19695 Import tablespace doesn't work with ROW_FORMAT=COMPRESSED encrypted tablespace
Problem:
=======
fil_iterate() writes imported tablespace page0 as it is to discarded
tablespace. Space id wasn't even changed. While opening the tablespace,
tablespace fails with space id mismatch error.

Fix:
====
fil_iterate() copies the page0 with discarded space id to imported
tablespace.
2019-06-06 12:54:34 +05:30
Monty
40becbc3c7 Fixed bug in online alter table when not compiled with performance schema
os_file_write_func() and os_file_read_no_error_handling_func() returned
different result values depending on if UNIV_PFS_IO was defined or not.

Other things:
- Added some comments about return values for some functions
2019-06-03 15:06:51 +03:00
Marko Mäkelä
6eefeb6fea MDEV-19541: Avoid infinite loop of reading a corrupted page
row_search_mvcc(): Duplicate the logic of btr_pcur_move_to_next()
so that an infinite loop can be avoided when advancing to the next
page fails due to a corrupted page.
2019-05-29 11:20:56 +03:00
Marko Mäkelä
d59e15bdb9 Merge 10.1 into 10.2 2019-05-28 15:56:24 +03:00
Marko Mäkelä
242a28c320 MDEV-6812: Remove the wrapper my_log2f() 2019-05-28 10:54:30 +03:00
Marko Mäkelä
74904a667e Remove UT_NOT_USED
btr_pcur_move_to_last_on_page(): Merge with the only caller.
2019-05-20 17:09:50 +03:00
Marko Mäkelä
50999738ea Merge 10.1 into 10.2 2019-05-13 18:48:28 +03:00
Marko Mäkelä
b93ecea65c Remove unnecessary pointer indirection for rw_lock_t
In MySQL 5.7.8 an extra level of pointer indirection was added to
dict_operation_lock and some other rw_lock_t without solid justification,
in mysql/mysql-server@52720f1772.

Let us revert that change and remove the rather useless rw_lock_t
constructor and destructor and the magic_n field. In this way,
some unnecessary pointer dereferences and heap allocation will be avoided
and debugging might be a little easier.
2019-05-13 18:46:12 +03:00
Marko Mäkelä
26a14ee130 Merge 10.1 into 10.2 2019-05-13 17:54:04 +03:00
Marko Mäkelä
7f7211073c MDEV-19441 Typo in error message "InnoDB: FTS Doc ID must be large than"
row_insert_for_mysql(): Correct the grammar error, and
display the table name in both messages.
2019-05-13 08:54:43 +03:00
Vicențiu Ciorbaru
c0ac0b8860 Update FSF address 2019-05-11 19:25:02 +03:00
Vicențiu Ciorbaru
f177f125d4 Merge branch '5.5' into 10.1 2019-05-11 19:15:57 +03:00
Vicențiu Ciorbaru
15f1e03d46 Follow-up to changing FSF address
Some places didn't match the previous rules, making the Floor
address wrong.

Additional sed rules:

sed -i -e 's/Place.*Suite .*, Boston/Street, Fifth Floor, Boston/g'
sed -i -e 's/Suite .*, Boston/Fifth Floor, Boston/g'
2019-05-11 18:30:45 +03:00
Marko Mäkelä
8ce702aa90 MDEV-17540 Server crashes in row_purge after TRUNCATE TABLE
row_purge_upd_exist_or_extern_func(): Check for node->vcol_op_failed()
after row_purge_remove_sec_if_poss(), like row_purge_del_mark() did.
This avoids us dereferencing the node->table=NULL pointer.

The test case, submitted by Elena Stepanova, is not deterministic and
does not repeat the bug on 10.2. With the added loop, for me, it reliably
crashes 10.3 without the fix. I was unable to create a deterministic
test case for either 10.2 or 10.3.

Reviewed by Thirunarayanan Balathandayuthapani
2019-05-10 10:50:35 +03:00
Marko Mäkelä
b2f3755c8e Merge 10.1 into 10.2 2019-05-10 08:02:21 +03:00
Marko Mäkelä
f92749ed36 MDEV-18220: heap-use-after-free in fts_get_table_name_prefix()
fts_table_t::parent: Remove the redundant field. Refer to
table->name.m_name instead.

fts_update_sync_doc_id(), fts_update_next_doc_id(): Remove
the redundant parameter table_name.

fts_get_table_name_prefix(): Access the dict_table_t::name.
FIXME: Ensure that this access is always covered by
dict_sys->mutex.
2019-05-10 07:57:01 +03:00
Marko Mäkelä
ce195987c3 MDEV-19385: Inconsistent definition of dtuple_get_nth_v_field()
The accessor dtuple_get_nth_v_field() was defined differently between
debug and release builds in MySQL 5.7.8 in
mysql/mysql-server@c47e1751b7
and a debug assertion to document or enforce the questionable assumption
tuple->v_fields == &tuple->fields[tuple->n_fields] was missing.

This was apparently no problem until MDEV-11369 introduced instant
ADD COLUMN to MariaDB Server 10.3. With that work present, in one
test case, trx_undo_report_insert_virtual() could in release builds
fetch the wrong value for a virtual column.

We replace many of the dtuple_t accessors with const-preserving
inline functions, and fix missing or misleadingly applied const
qualifiers accordingly.
2019-05-03 20:02:50 +03:00
Marko Mäkelä
3db94d2403 MDEV-19346: Remove dummy InnoDB log checkpoints
log_checkpoint(), log_make_checkpoint_at(): Remove the parameter
write_always. It seems that the primary purpose of this parameter
was to ensure in the function recv_reset_logs() that both checkpoint
header pages will be overwritten, when the function is called from
the never-enabled function recv_recovery_from_archive_start().

create_log_files(): Merge recv_reset_logs() to its only caller.

Debug instrumentation: Prefer to flush the redo log, instead of
triggering a redo log checkpoint.

page_header_set_field(): Disable a debug assertion that will
always fail due to MDEV-19344, now that we no longer initiate
a redo log checkpoint before an injected crash.

In recv_reset_logs() there used to be two calls to
log_make_checkpoint_at(). The apparent purpose of this was
to ensure that both InnoDB redo log checkpoint header pages
will be initialized or overwritten.
The second call was removed (without any explanation) in MySQL 5.6.3:
mysql/mysql-server@4ca37968da

In MySQL 5.6.8 WL#6494, starting with
mysql/mysql-server@00a0ba8ad9
the function recv_reset_logs() was not only invoked during
InnoDB data file initialization, but also during a regular
startup when the redo log is being resized.

mysql/mysql-server@45e9167983
in MySQL 5.7.2 removed the UNIV_LOG_ARCHIVE code, but still
did not remove the parameter write_always.
2019-05-03 20:02:11 +03:00
Marko Mäkelä
bc145193c1 Merge 10.1 into 10.2 2019-04-25 09:04:09 +03:00
Marko Mäkelä
376bf4ede5 MDEV-19241 InnoDB fails to write MLOG_INDEX_LOAD upon completing ALTER TABLE
Similar to what was done in commit aa3f7a107c
for FULLTEXT INDEX, we must ensure that MLOG_INDEX_LOAD records will always
be written if redo logging was disabled.

row_merge_build_indexes(): Invoke row_merge_write_redo() also when
online operation is not being executed or an error occurs.
In case of an error, invoke flush_observer->interrupted() so that
the pages will not be flushed but merely evicted from the buffer pool.
Before resuming redo logging, it is crucial for the correctness of
mariabackup and InnoDB crash recovery to flush or evict all affected pages
and to write MLOG_INDEX_LOAD records.
2019-04-17 13:58:22 +03:00
Marko Mäkelä
7f5849a809 MDEV-18309: Remove unused code 2019-04-07 12:05:12 +03:00
Marko Mäkelä
867617a976 MDEV-18309: InnoDB reports bogus errors about missing #sql-*.ibd on startup
This is a follow-up to MDEV-18733. As part of that fix, we made
dict_check_sys_tables() skip tables that would be dropped by
row_mysql_drop_garbage_tables().

DICT_ERR_IGNORE_DROP: A new mode where the file should not be attempted
to be opened.

dict_load_tablespace(): Do not try to load the tablespace if
DICT_ERR_IGNORE_DROP has been specified.

row_mysql_drop_garbage_tables(): Pass the DICT_ERR_IGNORE_DROP mode.

fil_space_for_table_exists_in_mem(): Remove a parameter.
The only caller that passed print_error_if_does_not_exist=true
was row_drop_single_table_tablespace().
2019-04-07 10:57:38 +03:00
Marko Mäkelä
aa3f7a107c MDEV-12699 preparation: Write MLOG_INDEX_LOAD for FTS_ tables
The record MLOG_INDEX_LOAD is supposed to be written to indicate that
some page modifications bypassed redo logging, and that redo logging
is now re-enabled. It was not written for fulltext indexes during
ALTER TABLE.

row_merge_write_redo(): Declare globally. Assert that the index
is neither a spatial nor fulltext index.

recv_mlog_index_load(): Observe a MLOG_INDEX_LOAD operation.

recv_parse_log_recs(): Handle MLOG_INDEX_LOAD also in multi-record
mini-transactions. Because of this omission, we should keep writing
MLOG_INDEX_LOAD in single-record mini-transactions, because older
versions of Mariabackup would fail.

row_fts_merge_insert(): Write MLOG_INDEX_LOAD for the auxiliary
tables of fulltext indexes.
2019-04-06 21:25:43 +03:00
Marko Mäkelä
f602385776 Do not pass table_name_t to printf-like functions 2019-04-04 08:57:53 +03:00
Marko Mäkelä
a1ec7ac4f4 Clean up table_name_t
row_is_mysql_tmp_table_name(): Replaced with
dict_table_t::is_temporary_name() and table_name_t::is_temporary().

table_name_t: Add constructors.
2019-04-03 16:09:16 +03:00
Marko Mäkelä
dbc716675b Merge 10.1 into 10.2 2019-04-03 10:32:21 +03:00
Marko Mäkelä
c0fca2863b Fix -Wnonnull-compare
InnoDB and XtraDB had redundant assertions for checking that
function parameters that were declared as nonnull were not NULL.
2019-04-03 09:46:49 +03:00
Marko Mäkelä
d0116e10a5 Revert MDEV-18464 and MDEV-12009
This reverts commit 21b2fada7a
and commit 81d71ee6b2.

The MDEV-18464 change introduces a few data race issues. Contrary to
the documentation, the field trx_t::victim is not always being protected
by lock_sys_t::mutex and trx_t::mutex. Most importantly, it seems
that KILL QUERY could wrongly avoid acquiring both mutexes when
invoking lock_trx_handle_wait_low(), in case another thread had
already set trx->victim=true.

We also revert MDEV-12009, because it should depend on the MDEV-18464
fix being present.
2019-03-28 12:39:50 +02:00
Thirunarayanan Balathandayuthapani
0623cc7c16 MDEV-19051 Avoid unnecessary writing MLOG_INDEX_LOAD
1) Avoid writing of MLOG_INDEX_LOAD redo log record during inplace
alter table when the table is empty and also for spatial index.

2) Avoid creation of temporary merge file for spatial index during
index creation process.
2019-03-28 10:55:18 +02:00
Jan Lindström
21b2fada7a MDEV-18464: Port kill_one_trx fixes from 10.4 to 10.1
Pushed the decision for innodb transaction and system
locking down to lock0lock.cc level. With this,
we can avoid releasing these mutexes for executions
where these mutexes were acquired upfront.

This patch will also fix BF aborting of native threads, e.g.
threads which have declared wsrep_on=OFF. Earlier, we have
used, for innodb trx locks, was_chosen_as_deadlock_victim
flag, for marking inodb transactions, which are victims for
wsrep BF abort. With native threads (wsrep_on==OFF), re-using
was_chosen_as_deadlock_victim flag may lead to inteference
with real deadlock, and to deal with this, the patch has added new
flag for marking wsrep BF aborts only: victim=true

Similar way if replication decides to abort one of the threads
we mark victim by: victim=true

innobase_kill_query
	Remove lock sys and trx mutex handling.

wsrep_innobase_kill_one_trx
	Mark victim trx with victim=true

trx0trx.h
	Remove trx_abort_t type and abort type variable from
	trx struct. Add victim variable to trx.

wsrep_kill_victim
	Remove abort_type

lock_report_waiters_to_mysql
	Take also trx mutex and mark trx as a victim for
	replication abort.

lock_trx_handle_wait_low
	New low level function to check whether the transaction
	has already been rolled back because it was selected as
	a deadlock victim, or if it has to wait then cancel
	the wait lock.

lock_trx_handle_wait
	If transaction is not marked as victim take lock sys
	and trx mutex before calling lock_trx_handle_wait_low
	and release them after that.

row_search_for_mysql
	Remove lock sys and trx mutex taking and releasing.

trx_rollback_to_savepoint_for_mysql_low
trx_commit_in_memory
	Clean up victim variable.
2019-03-28 07:40:03 +02:00
Sergei Golubchik
f97d879bf8 cmake: re-enable -Werror in the maintainer mode
now we can afford it. Fix -Werror errors. Note:
* old gcc is bad at detecting uninit variables, disable it.
* time_t is int or long, cast it for printf's
2019-03-27 22:51:37 +01:00
Marko Mäkelä
b59d484696 MDEV-14126: Remove page_is_root()
The predicate page_is_root(), which was added in MariaDB Server 10.2.2,
is based on a wrong assumption.

Under some circumstances, InnoDB can transform B-trees into a degenerate
state where a non-leaf page has no sibling pages. Because of this,
we cannot assume that a page that has no siblings is the root page.
This bug will be tracked as MDEV-19022.

Because of the bug that may affect many InnoDB data files, we must remove
and replace the wrong predicate. Using the wrong predicate can cause
corruption. A leaf page is not allowed to be empty except if it is the
root page, and the entire table is empty.
2019-03-25 10:53:00 +02:00
Marko Mäkelä
185a5000e6 Silence bogus -Wmaybe-uninitialized 2019-03-21 09:24:03 +02:00
Marko Mäkelä
630199e724 MDEV-18981 Possible corruption when using FOREIGN KEY with virtual columns
row_ins_foreign_fill_virtual(): Construct update->old_vrow
with ROW_COPY_DATA instead of ROW_COPY_POINTERS. With the latter,
the object would be pointing to a buffer pool page frame. That page
frame can become stale and invalid as soon as
row_ins_foreign_check_on_constraint() invokes mtr_t::commit().

Most of the time, the pointer target is not going to be overwritten
by anything, and everything appears to work correctly.
Buffer pool page replacement is highly unlikely, and any pessimistic
operation that would overwrite the old location of the record is only
slightly more likely. It is not known whether there is an actual bug.
This came up while diagnosing MDEV-18879 in MariaDB 10.3.
2019-03-20 18:34:49 +02:00
Marko Mäkelä
56304875f7 MDEV-18836 ASAN: heap-use-after-free after TRUNCATE
row_drop_tables_for_mysql_in_background(): Copy the table name
before closing the table handle, to avoid heap-use-after-free if
another thread succeeds in dropping the table before
row_drop_table_for_mysql_in_background() completes the table name lookup.

dict_mem_create_temporary_tablename(): With innodb_safe_truncate=ON
(the default), generate a simple, unique, collision-free table name
using only the id, no pseudorandom component. This is safe, because
on startup, we will drop any #sql tables that might exist in InnoDB.
This is a backport from 10.3. It should have been backported already
as part of backporting MDEV-14717,MDEV-14585 which were prerequisites
for the MDEV-13564 backup-friendly TRUNCATE TABLE.
This seems to reduce the chance of table creation failures in
ha_innobase::truncate().

ha_innobase::truncate(): Do not invoke close(), but instead
mimic it, so that we can restore to the original table handle
in case opening the truncated copy of the table failed.
2019-03-13 13:31:45 +02:00
Marko Mäkelä
ee17f7285a TruncateLogParser::scan(): Simplify string operations
This also fixes -Wstringop-truncation that GCC 8 issued for strncpy().
2019-03-12 19:46:28 +02:00