5243 commits

Author SHA1 Message Date
Marko Mäkelä
12f804acfa MDEV-14441 Deadlock due to InnoDB adaptive hash index
This is mere code clean-up; the reported problem was already fixed
in commit 3fdd390791.

row_sel(): Remove the variable search_latch_locked.

row_sel_try_search_shortcut(): Remove the parameter
search_latch_locked, which was always passed as nonzero.

row_sel_try_search_shortcut(), row_sel_try_search_shortcut_for_mysql():
Do not expect the caller to acquire the AHI latch. Instead,
acquire and release it inside this function.

row_search_mvcc(): Remove a bogus condition on mysql_n_tables_locked.
When the btr_search_latch was split into an array of latches
in MySQL 5.7.8 as part of the Oracle Bug#20985298 fix, the "caching"
of the latch across storage engine API calls was removed, and
thus it is unnecessary to avoid adaptive hash index searches
during INSERT...SELECT.
2018-01-15 19:18:47 +02:00
Marko Mäkelä
458e33cfbc MDEV-14441 Deadlock due to InnoDB adaptive hash index
This is not fixing the reported problem, but a potential problem that was
introduced in MDEV-11369.

row_sel_try_search_shortcut(), row_sel_try_search_shortcut_for_mysql():
When an adaptive hash index search lands on top of rec_is_default_row(),
we must skip the candidate and perform a normal search. This is because
the adaptive hash index latch only protects the record from being deleted
but does not prevent concurrent inserts into the page. Therefore, it is not
safe to dereference the next-record pointer.
2018-01-15 19:12:30 +02:00
Marko Mäkelä
4ef25dbfd8 Merge bb-10.2-ext into 10.3 2018-01-15 19:11:28 +02:00
Marko Mäkelä
e2e740030d Merge 10.2 into bb-10.2-ext 2018-01-15 19:07:02 +02:00
Marko Mäkelä
3fdd390791 MDEV-14441 InnoDB hangs when setting innodb_adaptive_hash_index=OFF during UPDATE
This race condition is a regression caused by MDEV-12121.

btr_cur_update_in_place(): Determine block->index!=NULL only once
in order to determine whether an adaptive hash index bucket needs
to be exclusively locked and unlocked.

If we evaluated block->index multiple times, and the adaptive hash
index was disabled before we locked the adaptive hash index, then
we would never release the adaptive hash index bucket latch, which
would eventually lead to InnoDB hanging.
2018-01-15 19:02:38 +02:00
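
The hazard described in 3fdd390791 can be illustrated with a minimal sketch. This is not the InnoDB code: `Latch`, `Block`, and all function names below are hypothetical stand-ins, and the interleaving shown is only one way in which re-evaluating the condition can leave a latch held.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Hypothetical stand-ins; these are not the real InnoDB types.
struct Latch {
    int held = 0;  // number of outstanding lock() calls
    void lock() { ++held; }
    void unlock() { assert(held > 0); --held; }
};

struct Block {
    std::atomic<const void*> index{nullptr};  // non-null while the page is hashed
};

// Hazardous shape: block.index is read once to lock and again to unlock.
void update_rechecking(Block& block, Latch& latch,
                       const std::function<void()>& concurrent_event) {
    if (block.index.load() != nullptr) latch.lock();
    concurrent_event();  // e.g. the adaptive hash index being disabled
    if (block.index.load() != nullptr) latch.unlock();
}

// Safer shape: read block.index once and reuse the answer.
void update_deciding_once(Block& block, Latch& latch,
                          const std::function<void()>& concurrent_event) {
    const bool is_hashed = block.index.load() != nullptr;
    if (is_hashed) latch.lock();
    concurrent_event();
    if (is_hashed) latch.unlock();
}

// Runs one update while the index is cleared midway through.
// Returns the number of latch acquisitions that were never released.
int latches_left_held(bool decide_once) {
    static const int dummy_index = 0;
    Block block;
    block.index.store(&dummy_index);
    Latch latch;
    auto disable = [&block] { block.index.store(nullptr); };
    if (decide_once) {
        update_deciding_once(block, latch, disable);
    } else {
        update_rechecking(block, latch, disable);
    }
    return latch.held;
}
```

In this sketch, `latches_left_held(false)` reports one leaked acquisition, while `latches_left_held(true)` reports none, because the lock and unlock decisions share a single evaluation.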
Marko Mäkelä
39f236a2f5 Merge 10.2 into bb-10.2-ext 2018-01-15 16:41:10 +02:00
Marko Mäkelä
ec062c6181 MDEV-12121 follow-up: Unbreak the WITH_INNODB_AHI=OFF build 2018-01-15 15:40:28 +02:00
Marko Mäkelä
3d798be1d4 MDEV-14655 Assertion `!fts_index' failed in prepare_inplace_alter_table_dict
MariaDB inherits the MySQL limitation that ALGORITHM=INPLACE cannot
create more than one FULLTEXT INDEX at a time. As part of the MDEV-11369
Instant ADD COLUMN refactoring, MariaDB 10.3.2 accidentally stopped
enforcing the restriction.

Actually, it is a bug in MySQL 5.6 and MariaDB 10.0 that an ALTER TABLE
statement with multiple ADD FULLTEXT INDEX but without explicit
ALGORITHM=INPLACE would result in an error message, rather than
executing the operation with ALGORITHM=COPY.

ha_innobase::check_if_supported_inplace_alter(): Enforce the restriction
on multiple FULLTEXT INDEX.

prepare_inplace_alter_table_dict(): Replace some code with debug
assertions. A "goto error_handled" at this point would result in
another error, because the reference count of ctx->new_table would be 0.
2018-01-15 10:57:16 +02:00
Marko Mäkelä
70fff3688d Merge bb-10.2-ext into 10.3 2018-01-13 18:25:24 +02:00
Marko Mäkelä
bec2712775 Merge 10.2 into bb-10.2-ext 2018-01-13 18:18:28 +02:00
Marko Mäkelä
fc65577873 MDEV-14887 On a 32-bit system, MariaDB 10.2 mishandles data file sizes exceeding 4GiB
This is a regression that was introduced in MySQL 5.7.6 in
19855664de

fil_node_open_file(): Use proper 64-bit arithmetic for truncating
size_bytes to a multiple of a file extent size.
2018-01-13 18:15:04 +02:00
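
As an illustration of how rounding a 64-bit file size down to an extent boundary can go wrong on a 32-bit system, here is a minimal sketch. The names are hypothetical, `word32_t` merely models a 32-bit machine word, and the extent size is assumed to be a power of two; the actual code in fil_node_open_file() may differ.

```cpp
#include <cassert>
#include <cstdint>

// Models the machine word on a 32-bit system (assumption for illustration).
using word32_t = std::uint32_t;

// Problematic: the mask is computed in 32 bits and then zero-extended,
// so the AND also clears bits 32..63 of size_bytes.
std::uint64_t truncate_with_32bit_mask(std::uint64_t size_bytes,
                                       word32_t extent_size) {
    assert(extent_size != 0 && (extent_size & (extent_size - 1)) == 0);
    const word32_t mask = ~(extent_size - 1);
    return size_bytes & mask;
}

// Correct: widen the extent size to 64 bits before building the mask.
std::uint64_t truncate_with_64bit_mask(std::uint64_t size_bytes,
                                       word32_t extent_size) {
    assert(extent_size != 0 && (extent_size & (extent_size - 1)) == 0);
    const std::uint64_t mask = ~(std::uint64_t{extent_size} - 1);
    return size_bytes & mask;
}
```

For a file of 5 GiB plus 12345 bytes and a 1 MiB extent, the 64-bit version yields exactly 5 GiB, whereas the 32-bit mask yields 1 GiB because the upper half of the size is lost.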
Sergey Vojtovich
0a63b50c7a Cleanup UT_LOW_PRIORITY_CPU/UT_RESUME_PRIORITY_CPU
Server already has HMT_low/HMT_medium.
2018-01-13 13:08:59 +04:00
Marko Mäkelä
3e6fcb6ac8 MDEV-14935 Remove bogus conditions related to not redo-logging PAGE_MAX_TRX_ID changes
InnoDB originally skipped the redo logging of PAGE_MAX_TRX_ID changes
until I enabled it in commit e76b873f24
that was part of MySQL 5.5.5 already.

Later, when a more complete history of the InnoDB Plugin for MySQL 5.1
(aka branches/zip in the InnoDB subversion repository) and of the
planned-to-be closed-source branches/innodb+ that became the basis of
InnoDB in MySQL 5.5 was pushed to the MySQL source repository, the
change was part of commit 509e761f06:

 ------------------------------------------------------------------------
 r5038 | marko | 2009-05-19 22:59:07 +0300 (Tue, 19 May 2009) | 30 lines

 branches/zip: Write PAGE_MAX_TRX_ID to the redo log. Otherwise,
 transactions that are started before the rollback of incomplete
 transactions has finished may have an inconsistent view of the
 secondary indexes.

 dict_index_is_sec_or_ibuf(): Auxiliary function for controlling
 updates and checks of PAGE_MAX_TRX_ID: check whether an index is a
 secondary index or the insert buffer tree.

 page_set_max_trx_id(), page_update_max_trx_id(),
 lock_rec_insert_check_and_lock(),
 lock_sec_rec_modify_check_and_lock(), btr_cur_ins_lock_and_undo(),
 btr_cur_upd_lock_and_undo(): Add the parameter mtr.

 page_set_max_trx_id(): Allow mtr to be NULL.  When mtr==NULL, do not
 attempt to write to the redo log.  This only occurs when creating a
 page or reorganizing a compressed page.  In these cases, the
 PAGE_MAX_TRX_ID will be set correctly during the application of redo
 log records, even though there is no explicit log record about it.

 btr_discard_only_page_on_level(): Preserve PAGE_MAX_TRX_ID.  This
 function should be unreachable, though.

 btr_cur_pessimistic_update(): Update PAGE_MAX_TRX_ID.

 Add some assertions for checking that PAGE_MAX_TRX_ID is set on all
 secondary index leaf pages.

 rb://115 tested by Michael, fixes Issue #211
 ------------------------------------------------------------------------

After this fix, some bogus references to recv_recovery_is_on()
remained. Also, some references could be replaced with
references to index->is_dummy to prepare us for MDEV-14481
(background redo log apply).
2018-01-12 18:31:03 +02:00
Marko Mäkelä
6dd302d164 Merge bb-10.2-ext into 10.3 2018-01-11 19:44:41 +02:00
Marko Mäkelä
cca611d1c0 Merge 10.2 into bb-10.2-ext 2018-01-11 18:00:31 +02:00
Marko Mäkelä
773c3ceb57 MDEV-14824 Assertion `!trx_is_started(trx)' failed in innobase_start_trx_and_assign_read_view
In CREATE SEQUENCE or CREATE TEMPORARY SEQUENCE, we should not start
an InnoDB transaction for inserting the sequence status record into
the underlying no-rollback table. Because we did this, a debug assertion
would fail in START TRANSACTION WITH CONSISTENT SNAPSHOT after
CREATE TEMPORARY SEQUENCE was executed.

row_ins_step(): Do not start the transaction. Let the caller do that.

que_thr_step(): Start the transaction before calling row_ins_step().

row_ins_clust_index_entry(): Skip locking and undo logging for no-rollback
tables, even for temporary no-rollback tables.

row_ins_index_entry(): Allow trx->id==0 for no-rollback tables.

row_insert_for_mysql(): Do not start a transaction for no-rollback tables.
2018-01-11 16:34:31 +02:00
Marko Mäkelä
e9842de20c Merge 10.1 into 10.2 2018-01-11 12:05:57 +02:00
Marko Mäkelä
c15b3d2d41 Merge 10.0 into 10.1 2018-01-11 10:44:05 +02:00
Sergey Vojtovich
0ca2ea1a65 MDEV-14638 - Replace trx_sys_t::rw_trx_set with LF_HASH
trx reference counter was updated under mutex and read without any
protection. This is both slow and unsafe. Use atomic operations for
reference counter accesses.
2018-01-11 12:30:53 +04:00
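
A reference counter that is accessed only through atomic operations could look roughly like the following sketch. The type and member names are hypothetical and the memory orderings are one reasonable choice, not necessarily what the patch uses.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical object with an atomically maintained reference count.
struct RefCounted {
    std::atomic<std::int32_t> n_ref{0};

    // Take a reference; no mutex is needed.
    void reference() { n_ref.fetch_add(1, std::memory_order_relaxed); }

    // Drop a reference; release ordering publishes prior writes
    // to whoever later observes the count with acquire ordering.
    void release() {
        const std::int32_t old = n_ref.fetch_sub(1, std::memory_order_release);
        assert(old > 0);
        (void) old;
    }

    // Readers see a consistent value without holding any lock.
    bool is_referenced() const {
        return n_ref.load(std::memory_order_acquire) > 0;
    }
};

// Applies the given number of reference() and release() calls
// and returns the resulting count.
std::int32_t count_after(int references, int releases) {
    RefCounted obj;
    for (int i = 0; i < references; ++i) obj.reference();
    for (int i = 0; i < releases; ++i) obj.release();
    return obj.n_ref.load();
}
```

The point of the design is that both updates and reads go through the same atomic variable, so neither side needs the mutex and a reader can never observe a torn or stale-by-construction value.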
Sergey Vojtovich
380069c235 MDEV-14638 - Replace trx_sys_t::rw_trx_set with LF_HASH
trx_sys_t::rw_trx_set is implemented as std::set, which does a few quite
expensive operations under trx_sys_t::mutex protection: e.g. malloc/free
when adding/removing elements. Traversing b-tree is not that cheap either.

This has a negative scalability impact, which is especially visible when running
the oltp_update_index.lua benchmark on a ramdisk.

To reduce trx_sys_t::mutex contention std::set is replaced with LF_HASH. None
of LF_HASH operations require trx_sys_t::mutex (nor any other global mutex)
protection.

Another interesting issue observed with std::set is a reproducible ~2% performance
decline after the benchmark has run for ~60 seconds. With LF_HASH, results are stable.

All in all this patch optimises away one of three trx_sys->mutex locks per
oltp_update_index.lua query. The other two critical sections became smaller.

Relevant clean-ups:

Replaced rw_trx_set iteration at startup with local set. The latter is needed
because values inserted to rw_trx_list must be ordered by trx->id.

Removed redundant conditions from trx_reference(): it is (and even was) never
called with transactions that have trx->state == TRX_STATE_COMMITTED_IN_MEMORY.
do_ref_count doesn't (and probably even didn't) make any sense: now it is called
only when reference counter increment is actually requested.

Moved condition out of mutex in trx_erase_lists().

trx_rw_is_active(), trx_rw_is_active_low() and trx_get_rw_trx_by_id() were
greatly simplified and replaced by appropriate trx_rw_hash_t methods.

Compared to rw_trx_set, rw_trx_hash holds transactions only in PREPARED or
ACTIVE states. Transactions in COMMITTED state were required to be found
at InnoDB startup only. They are now looked up in the local set.

Removed unused trx_assert_recovered().

Removed unused innobase_get_trx() declaration.

Removed rather semantically incorrect trx_sys_rw_trx_add().

Moved information printout from trx_sys_init_at_db_start() to
trx_lists_init_at_db_start().
2018-01-11 12:30:53 +04:00
Marko Mäkelä
4c1479545d Merge 5.5 into 10.0 2018-01-11 10:16:52 +02:00
Marko Mäkelä
bdcd7f79e4 MDEV-14916 InnoDB reports warning for "Purge reached the head of the history list"
The warning was originally added in
commit c67663054a
(MySQL 4.1.12, 5.0.3) to trace claimed undo log corruption that
was analyzed in https://lists.mysql.com/mysql/176250
on November 9, 2004.

Originally, the limit was 20,000 undo log headers or transactions,
but in commit 9d6d1902e0
in MySQL 5.5.11 it was increased to 2,000,000.

The message can be triggered when the progress of purge is prevented
by a long-running transaction (or just an idle transaction whose
read view was started a long time ago), by running many transactions
that UPDATE or DELETE some records, then starting another transaction
with a read view, and finally by executing more than 2,000,000
transactions that UPDATE or DELETE records in InnoDB tables. Finally,
when the oldest long-running transaction is completed, purge would
run up to the next-oldest transaction, and there would still be more
than 2,000,000 transactions to purge.

Because the message can be triggered when the database is obviously
not corrupted, it should be removed. Heavy users of InnoDB should be
monitoring the "History list length" in SHOW ENGINE INNODB STATUS;
there is no need to spam the error log.
2018-01-11 09:55:10 +02:00
Marko Mäkelä
dfde5ae912 MDEV-14130 InnoDB messages should not refer to the MySQL 5.7 manual
Replace most occurrences of the REFMAN macro. For some pages there
is no replacement yet.
2018-01-10 13:53:44 +02:00
Marko Mäkelä
d1cf9b167c MDEV-14909 MariaDB 10.2 refuses to start up after clean shutdown of MariaDB 10.3
recv_log_recover_10_3(): Determine if a log from MariaDB 10.3 is clean.

recv_find_max_checkpoint(): Allow startup with a clean 10.3 redo log.

srv_prepare_to_delete_redo_log_files(): When starting up with a 10.3 log,
display a "Downgrading redo log" message instead of "Upgrading".
2018-01-10 13:18:02 +02:00
Aleksey Midenkov
c59c1a0736 System Versioning 1.0 pre8
Merge branch '10.3' into trunk
2018-01-10 12:36:55 +03:00
Marko Mäkelä
0b597d3ab2 Follow-up to MDEV-14837: Relax a too strict assertion 2018-01-09 14:50:02 +02:00
Sergei Golubchik
b85efdc3af rename system_time columns
sys_trx_start -> row_start
sys_trx_end -> row_end
2018-01-09 15:49:07 +03:00
Marko Mäkelä
fe79ac5b0e MDEV-14837 Duplicate primary keys are allowed after ADD COLUMN / UPDATE
This bug affected tables where the PRIMARY KEY contains variable-length
columns, and ROW_FORMAT is COMPACT or DYNAMIC.

rec_init_offsets_comp_ordinary(): Do not short-cut the parsing
of the record header for records that contain explicit values
for instantly added columns.

rec_copy_prefix_to_buf(): Copy more header for records that
contain explicit values for instantly added columns.
2018-01-09 13:48:41 +02:00
Marko Mäkelä
5a1283a4fa Follow-up to MDEV-12288: Add --debug=d,purge diagnostics
row_purge_reset_trx_id(): Display a DBUG message about resetting the
DB_TRX_ID.
2018-01-09 13:48:41 +02:00
Sergei Golubchik
e52a237fe9 remove ifdefs around PSI_THREAD_CALL
same change as for PSI_TABLE_CALL
2018-01-09 14:21:20 +03:00
Jan Lindström
07aa985979 MDEV-14776: InnoDB Monitor output generated by specific error is flooding error logs
innodb/buf_LRU_get_free_block
	Add debug instrumentation to produce error message about
	no free pages. Print error message only once and do not
	enable innodb monitor.

xtradb/buf_LRU_get_free_block
	Add debug instrumentation to produce error message about
	no free pages. Print error message only once and do not
	enable innodb monitor. Remove code that does not seem to
	be used.

innodb-lru-force-no-free-page.test
	New test case that forces the desired error message to be produced.
2018-01-09 12:48:31 +02:00
Marko Mäkelä
075f61a1d4 Revert part of commit fec844aca8
row_insert_for_mysql(): Remove some duplicated code
2018-01-09 11:30:36 +02:00
Marko Mäkelä
d8eef0f611 Merge 10.1 into 10.2 2018-01-08 16:49:31 +02:00
Marko Mäkelä
29b6e809a9 Merge 10.0 into 10.1 2018-01-08 14:51:20 +02:00
Marko Mäkelä
c903ba2f1e MDEV-13205 InnoDB: Failing assertion: !dict_index_is_online_ddl(index) upon ALTER TABLE
dict_foreign_find_index(): Ignore incompletely created indexes.
After a failed ADD UNIQUE INDEX, an incompletely created index
could be left behind until the next ALTER TABLE statement.
2018-01-08 14:26:55 +02:00
Marko Mäkelä
899c5899be MDEV-13101 Debug assertion failed in recv_parse_or_apply_log_rec_body()
recv_parse_or_apply_log_rec_body(): Tolerate MLOG_4BYTES for
dummy-writing the FIL_PAGE_SPACE_ID, written by fil_crypt_rotate_page().
2018-01-08 13:00:04 +02:00
Marko Mäkelä
8099941b46 MDEV-13487 Assertion failure in rec_get_trx_id()
rec_get_trx_id(): Because rec is not necessarily residing in
a buffer pool page (it could be an old version of a clustered index
record, allocated from heap), remove the debug assertions that
depend on page_align(rec).
2018-01-08 13:00:04 +02:00
Marko Mäkelä
16d308e21d MDEV-14874 innodb_encrypt_log corrupts the log when the LSN crosses 32-bit boundary
This bug affects both writing and reading encrypted redo log in
MariaDB 10.1, starting from version 10.1.3 which added support for
innodb_encrypt_log. That is, InnoDB crash recovery and Mariabackup
will sometimes fail when innodb_encrypt_log is used.

MariaDB 10.2 or Mariabackup 10.2 or later versions are not affected.

log_block_get_start_lsn(): Remove. This function would cause trouble if
a log segment that is being read is crossing a 32-bit boundary of the LSN,
because this function does not allow the most significant 32 bits of the
LSN to change.

log_blocks_crypt(), log_encrypt_before_write(), log_decrypt_after_read():
Add the parameter "lsn" for the start LSN of the block.

log_blocks_encrypt(): Remove (unused function).
2018-01-08 09:44:40 +02:00
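
The 32-bit boundary problem in 16d308e21d can be sketched as follows. This is a simplified, hypothetical model rather than the removed log_block_get_start_lsn(): it assumes a helper that rebuilds a block's start LSN from the upper 32 bits of a reference LSN and 32 low bits stored with the block, and compares it with passing the start LSN explicitly.

```cpp
#include <cassert>
#include <cstdint>

using lsn_t = std::uint64_t;
constexpr lsn_t BLOCK_SIZE = 512;  // assumed log block size for this sketch

// Hypothetical reconstruction: keeps the upper 32 bits of ref_lsn unchanged,
// so it cannot follow a segment that crosses a multiple of 2^32.
lsn_t start_lsn_keeping_high_bits(lsn_t ref_lsn, std::uint32_t low_bits) {
    return (ref_lsn & 0xFFFFFFFF00000000ULL) | low_bits;
}

// Alternative: the caller tracks the full 64-bit start LSN of each block
// while walking the segment and passes it along explicitly.
lsn_t nth_block_start_lsn(lsn_t first_block_lsn, std::uint64_t n) {
    assert(first_block_lsn % BLOCK_SIZE == 0);
    return first_block_lsn + n * BLOCK_SIZE;
}
```

Starting from a first block at LSN 0xFFFFFE00, the next block begins at 0x100000000. The explicit computation yields that value, while the reconstruction that keeps the high bits of the reference LSN yields 0.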
Marko Mäkelä
fa7d85bb87 Merge bb-10.2-ext into 10.3 2018-01-05 22:52:06 +02:00
Marko Mäkelä
6feb74c4b2 row_upd_rec_in_place(): Relax a debug assertion 2018-01-05 22:50:28 +02:00
Monty
e9a2082634 Merge remote-tracking branch 'origin/10.2' into bb-10.2-ext
Conflicts:
	mysql-test/r/cte_nonrecursive.result
	mysql-test/suite/galera/r/galera_bf_abort.result
	mysql-test/suite/galera/r/galera_bf_abort_get_lock.result
	mysql-test/suite/galera/r/galera_bf_abort_sleep.result
	mysql-test/suite/galera/r/galera_enum.result
	mysql-test/suite/galera/r/galera_fk_conflict.result
	mysql-test/suite/galera/r/galera_insert_multi.result
	mysql-test/suite/galera/r/galera_many_indexes.result
	mysql-test/suite/galera/r/galera_mdl_race.result
	mysql-test/suite/galera/r/galera_nopk_bit.result
	mysql-test/suite/galera/r/galera_nopk_blob.result
	mysql-test/suite/galera/r/galera_nopk_large_varchar.result
	mysql-test/suite/galera/r/galera_nopk_unicode.result
	mysql-test/suite/galera/r/galera_pk_bigint_signed.result
	mysql-test/suite/galera/r/galera_pk_bigint_unsigned.result
	mysql-test/suite/galera/r/galera_serializable.result
	mysql-test/suite/galera/r/galera_toi_drop_database.result
	mysql-test/suite/galera/r/galera_toi_lock_exclusive.result
	mysql-test/suite/galera/r/galera_toi_truncate.result
	mysql-test/suite/galera/r/galera_unicode_pk.result
	mysql-test/suite/galera/r/galera_var_auto_inc_control_off.result
	mysql-test/suite/galera/r/galera_wsrep_log_conficts.result
	sql/field.cc
	sql/rpl_gtid.cc
	sql/share/errmsg-utf8.txt
	sql/sql_acl.cc
	sql/sql_parse.cc
	sql/sql_partition_admin.cc
	sql/sql_prepare.cc
	sql/sql_repl.cc
	sql/sql_table.cc
	sql/sql_yacc.yy
2018-01-05 16:52:40 +02:00
Marko Mäkelä
9c9db1cbe2 MDEV-14059 Work around a problem exposed by InnoDB GIS debug check
row_sel_get_clust_rec_for_mysql(): Look up the page from the
buffer pool, similar to how MySQL 5.7 does it.
2018-01-05 12:10:16 +02:00
Marko Mäkelä
c8e6364407 Merge branch 10.1 into 10.2 2018-01-04 20:47:34 +02:00
Marko Mäkelä
21470de148 Merge 10.0 into 10.1 2018-01-04 20:42:29 +02:00
Marko Mäkelä
4496fd71f4 Fix a truncation warning introduced in MDEV-12323 2018-01-04 20:39:00 +02:00
Marko Mäkelä
218dbf68b8 MDEV-14058 InnoDB Assertion failure !leaf on rem0rec.cc line 566 on test innodb_gis.rtree_recovery
The function rtr_update_mbr_field_in_place() is generating
MLOG_REC_UPDATE_IN_PLACE or MLOG_COMP_REC_UPDATE_IN_PLACE records
on non-leaf pages, even though MLOG_WRITE_STRING would perfectly
suffice for updating a fixed-length data field.

btr_cur_parse_update_in_place(): If flags==7, the record may be
from rtr_update_mbr_field_in_place(), and we must check if the
page is a leaf page. Otherwise, assume that it is.

btr_cur_update_in_place(): Assert that the page is a leaf page.
2018-01-04 19:35:53 +02:00
Marko Mäkelä
c9ad134e56 Relax a bogus debug assertion
While insert direction makes no sense for SPATIAL INDEX (R-tree),
the field is apparently being used (and basically garbage).
Relax the debug assertion that was added in MDEV-11369.
2018-01-04 17:59:58 +02:00
Monty
5e0b13d173 Fixed wrong arguments to printf and related functions
Other things, mainly to get
create_mysqld_error_find_printf_error tool to work:

- Added protection to not include mysqld_error.h twice
- Include "unireg.h" instead of "mysqld_error.h" in server
- Added protection if ER_XX messages are already defined
- Removed wrong calls to my_error(ER_OUTOFMEMORY) as
  my_malloc() and my_alloc will do this automatically
- Added missing %s to ER_DUP_QUERY_NAME
- Removed old and wrong calls to my_strerror() when using
  MY_ERROR_ON_RENAME (wrong merge)
- Fixed deadlock error message from Galera. Before the extra
  information given to ER_LOCK_DEADLOCK was missing because
  ER_LOCK_DEADLOCK doesn't provide any extra information.

I kept #ifdef mysqld_error_find_printf_error_used in sql_acl.h
to make it easy to do this kind of check again in the future
2018-01-04 16:24:09 +02:00
Marko Mäkelä
145ae15a33 Merge bb-10.2-ext into 10.3 2018-01-04 09:22:59 +02:00
Marko Mäkelä
acd2862e65 MDEV-14848 MariaDB 10.3 refuses InnoDB crash-upgrade from MariaDB 10.2
While the redo log format was changed in MariaDB 10.3.2 and 10.3.3
due to MDEV-12288 and MDEV-11369, it should be technically possible
to upgrade from a crashed MariaDB 10.2 instance.

On a related note, it should be possible for Mariabackup 10.3
to create a backup from a running MariaDB Server 10.2.

mlog_id_t: Put back the 10.2 specific redo log record types
MLOG_UNDO_INSERT, MLOG_UNDO_ERASE_END, MLOG_UNDO_INIT,
MLOG_UNDO_HDR_REUSE.

trx_undo_parse_add_undo_rec(): Parse or apply MLOG_UNDO_INSERT.

trx_undo_erase_page_end(): Apply MLOG_UNDO_ERASE_END.

trx_undo_parse_page_init(): Parse or apply MLOG_UNDO_INIT.

trx_undo_parse_page_header_reuse(): Parse or apply MLOG_UNDO_HDR_REUSE.

recv_log_recover_10_2(): Remove. Always parse the redo log from 10.2.

recv_find_max_checkpoint(), recv_recovery_from_checkpoint_start():
Always parse the redo log from MariaDB 10.2.

recv_parse_or_apply_log_rec_body(): Parse or apply
MLOG_UNDO_INSERT, MLOG_UNDO_ERASE_END, MLOG_UNDO_INIT.

srv_prepare_to_delete_redo_log_files(),
innobase_start_or_create_for_mysql(): Upgrade from a previous (supported)
redo log format.
2018-01-03 19:08:50 +02:00