Commit graph

3012 commits

Author SHA1 Message Date
Marko Mäkelä
8b6cfda631 Merge 10.4 into 10.5 2020-02-07 08:51:20 +02:00
Marko Mäkelä
8b97eba31b MDEV-21674 purge_sys.stop() fails to wait for purge workers to complete
Since commit 5e62b6a5e0 (MDEV-16264),
purge_sys_t::stop() no longer waited for all purge activity to stop.

This caused problems on FLUSH TABLES...FOR EXPORT because of
purge running concurrently with the buffer pool flush.
The assertion at the end of buf_flush_dirty_pages() could fail.

The, implemented by Vladislav Vaintroub, aims to eliminate race
conditions when stopping or resuming purge:

waitable_task::disable(): Wait for the task to complete, then replace
the task callback function with noop.

waitable_task::enable(): Restore the original task callback function
after disable().

purge_sys_t::stop(): Invoke purge_coordinator_task.disable().

purge_sys_t::resume(): Invoke purge_coordinator_task.enable().

purge_sys_t::running(): Add const qualifier, and clarify the comment.
The purge coordinator task will remain active as long as any purge
worker task is active.

purge_worker_callback(): Assert purge_sys.running().

srv_purge_wakeup(): Merge with the only caller purge_sys_t::resume().

purge_coordinator_task: Use static linkage.
2020-02-07 08:12:58 +02:00
Marko Mäkelä
6d214415c9 MDEV-21351: Free processed recv_sys_t::blocks
Release memory as soon as redo log records are processed.

Because the memory allocation and deallocation of parsed redo log
records must be protected by recv_sys.mutex, it is better to avoid
using a std::atomic field for bookkeeping.

buf_page_t::access_time: Keep track of the recv_sys.pages record
allocations. The most significant 16 bits will count allocated
blocks (which were previously counted by buf_page_t::buf_fix_count
in the debug version), and the least significant 16 bits indicate
the number of allocated bytes in the block (which was previously
managed in buf_block_t::modify_clock), which must be a positive
number, up to innodb_page_size. The byte offset 65536 is represented
as the value 0.

recv_recover_page(): Let the caller erase the log.

recv_validate_tablespace(): Acquire recv_sys_t::mutex.
2020-02-06 09:00:19 +02:00
Marko Mäkelä
a9d1324867 Cleanup: Remove mem_block_t::magic_n and mem_block_validate()
Use of freed memory is better caught by AddressSanitizer,
especially with ASAN_POISON_MEMORY_REGION that is aliased
by MEM_NOACCESS and UNIV_MEM_FREE.
2020-02-03 12:34:08 +02:00
Eugene Kosov
691c691adc clean up redo log
main change: rename first redo log without file close

second change: use os_offset_t to represent offset in a file

third change: fix log texts
2020-02-01 23:58:24 +08:00
Marko Mäkelä
1b414c0313 MDEV-21256 after-merge fix: Use std::atomic
Starting with MariaDB Server 10.4, C++11 is being used.
Hence, std::atomic should be preferred to my_atomic.
2020-02-01 15:06:12 +02:00
Marko Mäkelä
4b291588bb MDEV-19845: Make my_cpu.h self-contained
Fix up commit f5c080c735
2020-02-01 14:56:05 +02:00
Eugene Kosov
bd36a4ca12 introduce HASH_REPLACE() for hash_table_t
HASH_REPLACE(): allows to not travel through linked list twice
when HASH_INSERT() happens right after HASH_DELETE()
2020-01-31 22:14:18 +08:00
Marko Mäkelä
5defdc382b Cleanup: Remove mtr_state_t and mtr_t::m_state
mtr_t::is_active(), mtr_t::is_committed(): Make debug-only.
2020-01-29 14:28:45 +02:00
Marko Mäkelä
50324ce624 MDEV-21351 Replace recv_sys.heap with list of buf_block_t
InnoDB crash recovery used a special type of mem_heap_t that
allocates backing store from the buffer pool. That incurred
a significant overhead, leading to underutilization of memory,
and limiting the maximum contiguous allocated size of a log record.

recv_sys_t::blocks: A linked list of buf_block_t that are allocated
by buf_block_alloc() for redo log records. Replaces recv_sys_t::heap.
We repurpose buf_block_t::unzip_LRU for linking the elements.

recv_sys_t::max_log_blocks: Renamed from recv_n_pool_free_frames.

recv_sys_t::max_blocks(): Accessor for max_log_blocks.

recv_sys_t::alloc(): Allocate memory from the current recv_sys_t::blocks
element, or allocate another block.  In debug builds, various free()
member functions must be invoked, because we repurpose
buf_page_t::buf_fix_count for tracking allocations.

recv_sys_t::free_corrupted_page(): Renamed from recv_recover_corrupt_page()

recv_sys_t::is_memory_exhausted(): Renamed from recv_sys_heap_check()

recv_sys_t::pages and its elements are allocated directly by the
system memory allocator.

recv_parse_log_recs(): Remove the parameter available_memory.

We rename some variables 'store_to_hash' to 'store', because
recv_sys.pages is not actually a hash table.

This is joint work with Thirunarayanan Balathandayuthapani.
2020-01-29 12:53:39 +02:00
Alexander Barkov
f1e13fdc8d MDEV-21581 Helper functions and methods for CHARSET_INFO 2020-01-28 12:29:23 +04:00
Eugene Kosov
b534a6675c cleanup redo log
class log_file_t: more or less sane RAII wrapper around redo log file
descriptor and its path.

This change is motivated by the need of using that log_file_t somewhere else.
2020-01-24 23:27:38 +08:00
Eugene Kosov
34dafb7e3a redo log mics fixes
os_file_flush_data_func(): fix builds on POSIX OSs where fdatasync()
is not avaiable

log_t::files::flush_data_only(): rename from fdatasync()

log_t::files::fsync(): removed and replaced with flush_data_only().
It will flush everything we need for using redo log files.
2020-01-23 22:46:43 +08:00
Eugene Kosov
700e010309 fix aligned memcpy()-like functions usage
I found that memcpy_aligned was used incorrectly at redo log and decided to put
assertions in aligned functions. And found even more incorrect cases.

Given the amount discovered of bugs, I left assertions to prevent future bugs.

my_assume_aligned(): instead of MY_ASSUME_ALIGNED macro
2020-01-23 00:12:43 +08:00
Marko Mäkelä
ded128aa9b Merge 10.4 into 10.5 2020-01-20 16:48:56 +02:00
Marko Mäkelä
87a61355e8 Merge 10.3 into 10.4
The MDEV-17062 fix in commit c4195305b2
was omitted.
2020-01-20 15:49:48 +02:00
Eugene Kosov
e9de6386ad MDEV-18115 remove now unneeded constraint
log_group_max_size: is not needed because redo log do not use fil_io() now
2020-01-18 23:42:55 +08:00
Marko Mäkelä
6373ec3ec7 Merge 10.2 into 10.3 2020-01-18 16:56:16 +02:00
Marko Mäkelä
02af6278fb InnoDB 5.6.47 and XtraDB 5.6.46-86.2
The only change is a change of the version number.
In MySQL 5.6.46, the copyright comments in a number of files were changed
in mysql/mysql-server@f1a006ece7
but there was no functional change to InnoDB code.
This was also reflected by XtraDB. We are not changing the copyright
comments in MariaDB Server for now.

Between MySQL 5.6.46 and 5.6.47, InnoDB was not changed at all.

Actually, we had forgotten to update the InnoDB version number to
5.6.46. With this change, we are updating InnoDB
from 5.6.45 to 5.6.47 and XtraDB from 5.6.45-86.1 to 5.6.46-86.2.
2020-01-17 17:39:20 +02:00
Marko Mäkelä
7b70cbd838 MDEV-21499 Merge new release of InnoDB 5.7.29 to 10.2 2020-01-17 16:24:40 +02:00
Marko Mäkelä
457ce97ef2 MDEV-21512 InnoDB may hang due to SPATIAL INDEX
MySQL 5.7.29 includes the following fix:
Bug #30287668 INNODB: A LONG SEMAPHORE WAIT
mysql/mysql-server@5cdbb22b51

There is no test case. It seems that the problem could occur when
a spatial index is large and peculiar enough so that multiple R-tree
leaf pages will have the exactly same maximum bounding rectangle (MBR).

The commit message suggests that the hang can occur when R-tree
non-leaf pages are being merged, which should only be possible
during transaction rollback or the purge of transaction history,
when the R-tree index is at least 2 levels high and very many records
are being deleted. The message says that a comparison result that two
spatial index node pointer records are equal will cause an infinite loop
in rtr_page_copy_rec_list_end_no_locks(). Hence, we must include the
child page number in the comparison to be consistent with
mysql/mysql-server@2e11fe0e15.

We fix this bug in a simpler way, involving fewer code changes.

cmp_rec_rec(): Renamed from cmp_rec_rec_with_match().
Assert that rec2 always resides in an index page.
Treat non-leaf spatial index pages specially.
2020-01-17 14:27:29 +02:00
Marko Mäkelä
c3695b4058 MDEV-21511: Remove unnecessary code
Now that we will be invoking dtuple_get_n_ext() instead of
letting btr_push_update_extern_fields() update an already
calculated value, it is unnecessary to calculate the n_ext
upfront.

row_rec_to_index_entry(), row_rec_to_index_entry_low():
Remove the output parameter n_ext.
2020-01-17 14:27:29 +02:00
Marko Mäkelä
5838b52743 MDEV-21511 Wrong estimate of affected BLOB columns in update
During update, rollback, or MVCC read, we may miscalculate
the number of off-page columns, and thus the size of the
clustered index record. The function btr_push_update_extern_fields()
is mostly redundant, because the off-page columns would also be
moved by row_upd_index_replace_new_col_val(), which is invoked
via row_upd_index_replace_new_col_vals().

btr_push_update_extern_fields(): Remove.

This is based on
mysql/mysql-server@1fa475b85d
which refines a fix for a recovery bug fix
mysql/mysql-server@ce0a1e85e2
in MySQL 5.7.5.

No test case was provided by Oracle.
Some of the changed code is being covered by the existing test
innodb.blob-crash.
2020-01-17 14:27:28 +02:00
Marko Mäkelä
3e38d15585 MDEV-21509 Possible hang during purge of history, or rollback
WL#6326 in MariaDB 10.2.2 introduced a potential hang on purge or rollback
when an index tree is being shrunk by multiple levels.

This fix is based on
mysql/mysql-server@f2c5852630
with the main difference that our version of the test case uses
DEBUG_SYNC instrumentation on ROLLBACK, not on purge.

btr_cur_will_modify_tree(): Simplify the check further.
This is the actual bug fix.

row_undo_mod_remove_clust_low(), row_undo_mod_clust(): Add DEBUG_SYNC
instrumentation for the test case.
2020-01-17 14:27:28 +02:00
Sergei Golubchik
e7558d4760 fix compilation w/o perfschema
followup for 3a3605f4b1
2020-01-16 18:13:55 +01:00
Marko Mäkelä
cc3135cf83 Fix a typo in a comment 2020-01-14 12:31:17 +02:00
Marko Mäkelä
90002480e9 MDEV-18115: Remove OS_AIO_LOG and IORequest::LOG 2020-01-10 19:08:46 +02:00
Eugene Kosov
3a3605f4b1 MDEV-21382 fix compilation without perfschema plugin 2020-01-09 16:57:15 +07:00
Marko Mäkelä
c62efb083c MDEV-21174: Remove a bogus comment 2020-01-08 18:23:55 +02:00
Eugene Kosov
cba9ed1279 fix compilation 2020-01-08 20:18:50 +07:00
Marko Mäkelä
68fe5f534c Merge 10.4 into 10.5 2020-01-07 14:10:15 +02:00
Marko Mäkelä
d60dcabd0f Merge 10.3 into 10.4 2020-01-07 13:23:41 +02:00
Marko Mäkelä
eda719793a Merge 10.2 into 10.3 2020-01-07 12:14:35 +02:00
Marko Mäkelä
82187a1221 MDEV-21429 TRUNCATE and OPTIMIZE are being refused due to "row size too large"
By default (innodb_strict_mode=ON), InnoDB attempts to guarantee
at DDL time that any INSERT to the table can succeed.
MDEV-19292 recently revised the "row size too large" check in InnoDB.
The check still is somewhat inaccurate;
that should be addressed in MDEV-20194.

Note: If a table contains multiple long string columns so that each column
is part of a column prefix index, then an UPDATE that attempts to modify
all those columns at once may fail, because the undo log record might
not fit in a single undo log page (of innodb_page_size). In the worst case,
the undo log record would grow by about 3KiB of for each updated column.

The DDL-time check (since the InnoDB Plugin for MySQL 5.1) is optional
in the sense that when the maximum B-tree record size or undo log
record size would be exceeded, the DML operation will fail and the
transaction will be properly rolled back.

create_table_info_t::row_size_is_acceptable(): Add the parameter
'bool strict' so that innodb_strict_mode=ON can be overridden during
TRUNCATE, OPTIMIZE and ALTER TABLE...FORCE (when the storage format
is not changing).

create_table_info_t::create_table(): Perform a sloppy check for
TRUNCATE TABLE (create_fk=false).

prepare_inplace_alter_table_dict(): Perform a sloppy check for
simple operations.

trx_is_strict(): Remove. The function became unused in
commit 98694ab0cb (MDEV-20949).
2020-01-07 11:02:12 +02:00
Eugene Kosov
6f2e228529 MDEV-21382 use fdatasync() for redo log where appropriate
log_t::files::fdatasync(): syncs only data for every log file

os_file_flush_data()
pfs_os_file_flush_data_func(): syncs only data for a given file
2020-01-05 16:08:44 +07:00
Eugene Kosov
fd899b3bbd Lets add another intrusive double linked list!
Features:
* STL-like interface
* Fast modification: no branches on insertion or deletion
* Fast iteration: one pointer dereference and one pointer comparison
* Your class can be a part of several lists

Modeled after std::list<T> but currently has fewer methods (not complete yet)

For even more performance it's possible to customize list with templates so
it won't have size counter variable or won't NULLify unlinked node.

How existing lists differ?

No existing lists support STL-like interface.

I_List:
* slower iteration (one more branch on iteration)
* element can't be a part of two lists simultaneously

I_P_List:
* slower modification (branches, except for the fastest push_back() case)
* slower iteration (one more branch on iteration)

UT_LIST_BASE_NODE_T:
* slower modification (branches)

Three UT_LISTs were replaced: two in fil_system_t and one in dyn_buf_t.
2020-01-04 13:39:14 +07:00
Marko Mäkelä
9949ab9393 MDEV-12353 preparation: Cleanup MLOG_FILE_NAME logging
mtr_t::commit_files(): Renamed from mtr_t::commit_checkpoint().
Remove the redundant bool parameter, and instead use checkpoint_lsn=0
to indicate that no checkpoint marker should be written.
2020-01-02 11:15:04 +02:00
Eugene Kosov
562c037b48 MDEV-18115 Remove dummy tablespace for the redo log
Redo log subsystem was decoupled from tablespace subsystem. It now manages file
descriptors for redo log files by itself.

FIL_TYPE_LOG: removed, code in various places was simplified

SRV_LOG_SPACE_FIRST_ID: renamed to SRV_SPACE_ID_UPPER_BOUND
  to better match its purpose. Code in various places was simplified

fil_n_log_flushes: replaced with log_sys::flushes
fil_n_pending_log_flushes: replaced with log_sys::pending_flushes

log_t::files::files: redo log file descriptors
log_t::files::file_names: redo log file names

log_t::files::set_file_names(): set file names without opening them
log_t::files::open_files(): opens redo log files
log_t::files::read(): treats several files as one big
log_t::files::write(): treats several files as one big
log_t::files::fsync(): flushes page cache to disk
log_t::files::close_files(): closes redo log files

fil_open_log_and_system_tablespace_files(): renamed to
  fil_open_system_tablespace_files()
  and obviously it now doesn't open redo log files

global files[1000]: removed. Why it was needed at all?
2020-01-01 22:09:51 +08:00
Marko Mäkelä
3fa4a9e6be Merge 10.4 into 10.5 2019-12-30 10:29:43 +02:00
Marko Mäkelä
ffc0a08d05 Merge 10.3 into 10.4 2019-12-30 10:27:59 +02:00
Nikita Malyavin
4923604ee2 MDEV-18865 Assertion `t->first->versioned_by_id()' failed in innodb_prepare_commit_versioned
Cause:
* row_start != 0 treated as it exists. Probably, possible row permutations had not been taken in mind.

Solution:
* Checking both row_start and row_end is correct, so versioned() function is used
2019-12-29 12:16:04 +02:00
Marko Mäkelä
b36154a109 Cleanup log_rec_t 2019-12-27 21:20:03 +02:00
Marko Mäkelä
8cc15c036d Merge 10.4 into 10.5 2019-12-27 21:17:16 +02:00
Marko Mäkelä
4c25e75ce7 Merge 10.3 into 10.4 2019-12-27 18:20:28 +02:00
Marko Mäkelä
5ab70e7f68 Merge 10.2 into 10.3 2019-12-27 15:14:48 +02:00
Marko Mäkelä
16bce0f6fe Cleanup: Remove dict_delete_tablespace_and_datafiles()
The function was only called by innobase_drop_tablespace(),
which was removed in commit 494e4b99a4
and added in commit 2e814d4702.
2019-12-27 11:23:28 +02:00
Alexander Barkov
4c57ab34d4 Merge remote-tracking branch 'origin/10.3' into 10.4 2019-12-25 13:33:28 +04:00
Thirunarayanan Balathandayuthapani
90ba87cb9e MDEV-19176 Reduce the memory usage during recovery
- post-push to fix the compilation issue
2019-12-23 15:56:57 +05:30
Thirunarayanan Balathandayuthapani
bba59abb03 MDEV-19176 Reduce the memory usage during recovery
- Moved the recv_sys->heap memory condition inside recv_parse_log_recs().
So that, InnoDB can mark the status as STORE_NO earlier.

- InnoDB uses one third of buffer pool chunk size for reading the redo
log records. In that case, we can avoid the scenario where buffer ran
out of memory issue during recovery.
2019-12-23 15:51:02 +05:30
Marko Mäkelä
73985d8301 Merge 10.1 into 10.2 2019-12-23 07:14:51 +02:00