Commit graph

2362 commits

Author SHA1 Message Date
Marko Mäkelä
4383897a01 MDEV-14425 preparation: Remove log_header_read()
The function log_header_read() was only used during server startup,
and it will mostly be used only for reading checkpoint information
from pre-MDEV-14425 format redo log files.

Let us replace the function with more direct calls, so that
it is clearer what is going on. It is not strictly necessary to
hold any mutex during this operation, and because there will be
only a limited number of operations during early server startup,
it is not necessary to increment any I/O counters.
2020-03-04 10:08:33 +02:00
Marko Mäkelä
8511f04fdb Cleanup: Remove srv_start_lsn
Most of the time, we can refer to recv_sys.recovered_lsn.
2020-03-02 15:01:46 +02:00
Eugene Kosov
9ef2d29ff4 MDEV-14425 deprecate and ignore innodb_log_files_in_group
Now there can be only one log file instead of several which
logically work as a single file.

Possible names of redo log files: ib_logfile0,
ib_logfile101 (for just created one)

innodb_log_files_in_group: value of this variable is not used
by InnoDB. Possible values are still 1..100, so as not to break upgrades

LOG_FILE_NAME: add constant of value "ib_logfile0"
LOG_FILE_NAME_PREFIX: add constant of value "ib_logfile"

get_log_file_path(): convenience function that returns full
path of a redo log file
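
A minimal sketch of such a convenience function, assuming a '/'
separator and a caller-supplied directory. The name log_file_path()
and its signature are hypothetical; only the constant value
"ib_logfile0" comes from this message:

```cpp
#include <cassert>
#include <string>

// Value taken from the commit message.
static const char LOG_FILE_NAME[] = "ib_logfile0";

// Hypothetical helper (not the real get_log_file_path() signature):
// join a directory and a redo log file name into one path.
// ASSUMPTION: '/' is an acceptable path separator.
inline std::string log_file_path(const std::string &dir,
                                 const char *name = LOG_FILE_NAME)
{
  assert(name && *name);
  if (dir.empty())
    return name;
  return dir.back() == '/' ? dir + name : dir + '/' + name;
}
```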

SRV_N_LOG_FILES_MAX: removed

srv_n_log_files: we can't remove this for compatibility reasons,
but the server no longer uses this variable

log_sys_t::file::fd: now just one, not std::vector

log_sys_t::log_capacity: removed word 'group'

find_and_check_log_file(): part of logic from huge srv_start()
moved here

recv_sys_t::files: file descriptors of redo log files.
There can be several of those in case we're upgrading
from older MariaDB version.

recv_sys_t::remove_extra_log_files: whether to remove
ib_logfile{1,2,3...} after successful upgrade.

recv_sys_t::read(): open if needed and read from one
of several log files

recv_sys_t::files_size(): open if needed and return files count

redo_file_sizes_are_correct(): check that redo log files
sizes are equal. Just to log an error for a user.
Corresponding check was moved from srv0start.cc

namespace deprecated: put all deprecated variables here to
prevent their use by us developers
2020-02-19 12:21:59 +03:00
Marko Mäkelä
f8a9f90667 MDEV-12353: Remove support for crash-upgrade
We tighten some assertions regarding dict_index_t::is_dummy
and crash recovery, now that redo log processing will
no longer create dummy objects.
2020-02-13 19:13:45 +02:00
Marko Mäkelä
7ae21b18a6 MDEV-12353: Change the redo log encoding
log_t::FORMAT_10_5: physical redo log format tag

log_phys_t: Buffered records in the physical format.
The log record bytes will follow the last data field,
making use of alignment padding that would otherwise be wasted.
If there are multiple records for the same page, also those
may be appended to an existing log_phys_t object if the memory
is available.

In the physical format, the first byte of a record identifies the
record and its length (up to 15 bytes). For longer records, the
immediately following bytes will encode the remaining length
in a variable-length encoding. Usually, a variable-length-encoded
page identifier will follow, followed by optional payload, whose
length is included in the initially encoded total record length.

When a mini-transaction is updating multiple fields in a page,
it can avoid repeating the tablespace identifier and page number
by setting the same_page flag (most significant bit) in the first
byte of the log record. The byte offset of the record will be
relative to where the previous record for that page ended.
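
The first-byte layout could be decoded roughly as follows. Only the
same_page flag in the most significant bit and the 15-byte limit are
stated above; the split into 3 type bits and 4 length bits, and the
meaning of length 0, are assumptions made for this sketch:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical decoded view of the first byte of a physical log record.
struct first_byte_t
{
  bool same_page;  // most significant bit (stated in the message above)
  uint8_t type;    // ASSUMPTION: bits 6..4 identify the record type
  uint8_t length;  // ASSUMPTION: bits 3..0 hold the length (1..15);
                   // 0 would mean that extra length bytes follow
};

inline first_byte_t decode_first_byte(uint8_t b)
{
  return first_byte_t{(b & 0x80) != 0, uint8_t((b >> 4) & 7),
                      uint8_t(b & 15)};
}
```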

Until MDEV-14425 introduces a separate file-level log for
redo log checkpoints and file operations, we will write the
file-level records in the page-level redo log file.
The record FILE_CHECKPOINT (which replaces MLOG_CHECKPOINT)
will be removed in MDEV-14425, and one sequential scan of the
page recovery log will suffice.

Compared to MLOG_FILE_CREATE2, FILE_CREATE will not include any flags.
If the information is needed, it can be parsed from WRITE records that
modify FSP_SPACE_FLAGS.

MLOG_ZIP_WRITE_STRING: Remove. The record was only introduced temporarily
as part of this work, before being replaced with WRITE (along with
MLOG_WRITE_STRING, MLOG_1BYTE, MLOG_nBYTES).

mtr_buf_t::empty(): Check if the buffer is empty.

mtr_t::m_n_log_recs: Remove. It suffices to check if m_log is empty.

mtr_t::m_last, mtr_t::m_last_offset: End of the latest m_log record,
for the same_page encoding.

page_recv_t::last_offset: Reflects mtr_t::m_last_offset.

Valid values for last_offset during recovery should be 0 or above 8.
(The first 8 bytes of a page are the checksum and the page number,
and neither are ever updated directly by log records.)
Internally, the special value 1 indicates that the same_page form
will not be allowed for the subsequent record.

mtr_t::page_create(): Take the block descriptor as parameter,
so that it can be compared to mtr_t::m_last. The INIT_INDEX_PAGE
record will always be followed by a subtype byte, because same_page
records must be longer than 1 byte.

trx_undo_page_init(): Combine the writes in WRITE record.

trx_undo_header_create(): Write 4 bytes using a special MEMSET
record that includes 1 byte of length and 2 bytes of payload.

flst_write_addr(): Define as a static function. Combine the writes.

flst_zero_both(): Replaces two flst_zero_addr() calls.

flst_init(): Do not inline the function.

fsp_free_seg_inode(): Zerofill the whole inode.

fsp_apply_init_file_page(): Initialize FIL_PAGE_PREV,FIL_PAGE_NEXT
to FIL_NULL when using the physical format.

btr_create(): Assert !page_has_siblings() because fsp_apply_init_file_page()
must have been invoked.

fil_ibd_create(): Do not write FILE_MODIFY after FILE_CREATE.

fil_names_dirty_and_write(): Remove the parameter mtr.
Write the records using a separate mini-transaction object,
because any FILE_ records must be at the start of a mini-transaction log.

recv_recover_page(): Add a fil_space_t* parameter.
After applying log to a ROW_FORMAT=COMPRESSED page,
invoke buf_zip_decompress() to restore the uncompressed page.

buf_page_io_complete(): Remove the temporary hack to discard the
uncompressed page of a ROW_FORMAT=COMPRESSED page.

page_zip_write_header(): Remove. Use mtr_t::write() or
mtr_t::memset() instead, and update the compressed page frame
separately.

trx_undo_header_add_space_for_xid(): Remove.

trx_undo_seg_create(): Perform the changes that were previously
made by trx_undo_header_add_space_for_xid().

btr_reset_instant(): New function: Reset the table to MariaDB 10.2
or 10.3 format when rolling back an instant ALTER TABLE operation.

page_rec_find_owner_rec(): Merge with the only callers.

page_cur_insert_rec_low(): Combine writes by using a local buffer.
MEMMOVE data from the preceding record whenever feasible
(copying at least 3 bytes).

page_cur_insert_rec_zip(): Combine writes to page header fields.

PageBulk::insertPage(): Issue MEMMOVE records to copy a matching
part from the preceding record.

PageBulk::finishPage(): Combine the writes to the page header
and to the sparse page directory slots.

mtr_t::write(): Only log the least significant (last) bytes
of multi-byte fields that actually differ.
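
A sketch of the idea, not the actual mtr_t::write() code: for a
big-endian field, count the leading bytes that are unchanged, so that
only the remaining (last) bytes need to be logged. The helper name is
hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: return how many leading (most significant) bytes
// of a big-endian field are identical in the old and the new value.
// Only the bytes from that offset to the end would need to be logged;
// a result equal to len means that nothing changed.
inline size_t skip_unchanged_prefix(const uint8_t *old_val,
                                    const uint8_t *new_val, size_t len)
{
  assert(old_val && new_val);
  size_t i = 0;
  while (i < len && old_val[i] == new_val[i])
    i++;
  return i;
}
```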

For updating FSP_SIZE, we must always write all 4 bytes to the
redo log, so that the fil_space_set_recv_size() logic in
recv_sys_t::parse() will work.

mtr_t::memcpy(), mtr_t::zmemcpy(): Take a pointer argument
instead of a numeric offset to the page frame. Only log the
last bytes of multi-byte fields that actually differ.

In fil_space_crypt_t::write_page0(), we must log also any
unchanged bytes, so that recovery will recognize the record
and invoke fil_crypt_parse().

Future work:
MDEV-21724 Optimize page_cur_insert_rec_low() redo logging
MDEV-21725 Optimize btr_page_reorganize_low() redo logging
MDEV-21727 Optimize redo logging for ROW_FORMAT=COMPRESSED
2020-02-13 19:12:17 +02:00
Marko Mäkelä
67c76704a8 MDEV-12353: Remove MLOG_INDEX_LOAD (innodb_log_optimize_ddl)
NOTE: This may break crash-upgrade from a dataset that was
created with innodb_log_optimize_ddl=ON. Also due to
ROW_FORMAT=COMPRESSED pages, it will be easiest to disallow
crash-upgrade.

It would be more robust to disable the MDEV-12699 logic when
crash-upgrading from old redo log format.

log_optimized_ddl_op: Remove.

fil_space_t::enable_lsn, file_name_t::enable_lsn: Remove.

ddl_tracker_t::optimized_ddl: Remove.

TODO: Remove ddl_tracker
2020-02-13 18:19:15 +02:00
Marko Mäkelä
1a6f708ec5 MDEV-15058: Deprecate and ignore innodb_buffer_pool_instances
Our benchmarking efforts indicate that the reasons for splitting the
buf_pool in commit c18084f71b
have mostly gone away, possibly as a result of
mysql/mysql-server@ce6109ebfd
or similar work.

Only in one write-heavy benchmark where the working set size is
ten times the buffer pool size, the buf_pool->mutex would be
less contended with 4 buffer pool instances than with 1 instance,
in buf_page_io_complete(). That contention could be alleviated
further by making more use of std::atomic and by splitting
buf_pool_t::mutex further (MDEV-15053).

We will deprecate and ignore the following parameters:

	innodb_buffer_pool_instances
	innodb_page_cleaners

There will be only one buffer pool and one page cleaner task.

In a number of INFORMATION_SCHEMA views, columns that indicated
the buffer pool instance will be removed:

	information_schema.innodb_buffer_page.pool_id
	information_schema.innodb_buffer_page_lru.pool_id
	information_schema.innodb_buffer_pool_stats.pool_id
	information_schema.innodb_cmpmem.buffer_pool_instance
	information_schema.innodb_cmpmem_reset.buffer_pool_instance
2020-02-12 14:45:21 +02:00
Eugene Kosov
691c691adc clean up redo log
main change: rename first redo log without file close

second change: use os_offset_t to represent offset in a file

third change: fix log texts
2020-02-01 23:58:24 +08:00
Marko Mäkelä
50324ce624 MDEV-21351 Replace recv_sys.heap with list of buf_block_t
InnoDB crash recovery used a special type of mem_heap_t that
allocates backing store from the buffer pool. That incurred
a significant overhead, leading to underutilization of memory,
and limiting the maximum contiguous allocated size of a log record.

recv_sys_t::blocks: A linked list of buf_block_t that are allocated
by buf_block_alloc() for redo log records. Replaces recv_sys_t::heap.
We repurpose buf_block_t::unzip_LRU for linking the elements.

recv_sys_t::max_log_blocks: Renamed from recv_n_pool_free_frames.

recv_sys_t::max_blocks(): Accessor for max_log_blocks.

recv_sys_t::alloc(): Allocate memory from the current recv_sys_t::blocks
element, or allocate another block.  In debug builds, various free()
member functions must be invoked, because we repurpose
buf_page_t::buf_fix_count for tracking allocations.
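
The allocation strategy can be pictured with the following sketch.
It uses a plain std::list of heap-allocated blocks instead of
buf_block_t and buf_block_alloc(); the class name and the 16384-byte
block size are assumptions made for illustration only:

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>

// Hypothetical stand-in for a list of fixed-size blocks.
class block_list_allocator
{
  // ASSUMPTION: each block is one 16KiB frame.
  static constexpr size_t BLOCK_SIZE = 16384;
  struct block
  {
    std::unique_ptr<unsigned char[]> frame;
    size_t used;
  };
  std::list<block> blocks;
public:
  // Carve len bytes out of the newest block,
  // or start another block if the request does not fit.
  void *alloc(size_t len)
  {
    assert(len > 0 && len <= BLOCK_SIZE);
    if (blocks.empty() || blocks.back().used + len > BLOCK_SIZE)
      blocks.push_back(
        block{std::make_unique<unsigned char[]>(BLOCK_SIZE), 0});
    block &b = blocks.back();
    void *p = b.frame.get() + b.used;
    b.used += len;
    return p;
  }
  size_t n_blocks() const { return blocks.size(); }
};
```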

recv_sys_t::free_corrupted_page(): Renamed from recv_recover_corrupt_page()

recv_sys_t::is_memory_exhausted(): Renamed from recv_sys_heap_check()

recv_sys_t::pages and its elements are allocated directly by the
system memory allocator.

recv_parse_log_recs(): Remove the parameter available_memory.

We rename some variables 'store_to_hash' to 'store', because
recv_sys.pages is not actually a hash table.

This is joint work with Thirunarayanan Balathandayuthapani.
2020-01-29 12:53:39 +02:00
Marko Mäkelä
a983b24407 Merge 10.4 into 10.5 2020-01-28 14:17:09 +02:00
Oleksandr Byelkin
bfc24bb2ec Merge branch '10.3' into 10.4 2020-01-24 14:50:23 +01:00
Oleksandr Byelkin
ceda5f724f Merge branch '10.2' into 10.3 2020-01-24 14:16:20 +01:00
Oleksandr Byelkin
f2ccfcaca1 Merge branch '10.1' into 10.2 2020-01-24 13:46:49 +01:00
Julius Goryavsky
982294ac16 MDEV-17601: MariaDB Galera does not expect 'mbstream' as streamfmt
Setting "streamfmt=mbstream" in the "[sst]" section causes SST to fail
because the format automatically switches to 'tar' by default (instead
of mbstream).

To fix this, we need to add mbstream to the list of valid values for
the format, making it synonymous with xbstream. This must be done both
in the SST script and when parsing the options of the corresponding
utilities.
2020-01-21 10:50:48 +01:00
Vladislav Vaintroub
7c0e4748ac silence a warning in WolfSSL.
There is a warning about inconsistency between function definition
and prototype.

See https://github.com/wolfSSL/wolfssl/issues/2752

Disable specific MSVC warning for now.
2020-01-21 09:20:59 +01:00
Oleksandr Byelkin
3155a643df new wolfssl v4.3.0-stable 2020-01-20 16:31:50 +01:00
Eugene Kosov
e9de6386ad MDEV-18115 remove now unneeded constraint
log_group_max_size: no longer needed because the redo log does not use fil_io() now
2020-01-18 23:42:55 +08:00
Sergei Golubchik
ff5a528f26 mysqltest crashes on Debian
Debian is apparently offended that pcre2-posix implements POSIX API,
thus it renames all posix-compatible symbols in libpcre2-posix to have the
PCRE2 prefix. But Debian doesn't do anything to pcre2posix.h header,
so any unaware application will get POSIX compatible type names
and function prototypes from pcre2, but actual symbols will come
from libc.

To remedy this enormous incongruity we have to redefine POSIX-compatible
function names in pcre2posix to match Debian's hack.
2020-01-16 18:13:55 +01:00
Eugene Kosov
562c037b48 MDEV-18115 Remove dummy tablespace for the redo log
Redo log subsystem was decoupled from tablespace subsystem. It now manages file
descriptors for redo log files by itself.

FIL_TYPE_LOG: removed, code in various places was simplified

SRV_LOG_SPACE_FIRST_ID: renamed to SRV_SPACE_ID_UPPER_BOUND
  to better match its purpose. Code in various places was simplified

fil_n_log_flushes: replaced with log_sys::flushes
fil_n_pending_log_flushes: replaced with log_sys::pending_flushes

log_t::files::files: redo log file descriptors
log_t::files::file_names: redo log file names

log_t::files::set_file_names(): set file names without opening them
log_t::files::open_files(): opens redo log files
log_t::files::read(): treats several files as one big
log_t::files::write(): treats several files as one big
log_t::files::fsync(): flushes page cache to disk
log_t::files::close_files(): closes redo log files
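
Treating several files as one big file amounts to translating an
offset in the concatenated log into a file index and an offset within
that file. A sketch, assuming that all files have the same size; the
function locate() is hypothetical and not part of log_t::files:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>

// Hypothetical helper: map an offset in the concatenated log to
// (file index, offset inside that file).
// ASSUMPTION: there are n_files files of exactly file_size bytes each.
inline std::pair<size_t, uint64_t> locate(uint64_t offset,
                                          uint64_t file_size,
                                          size_t n_files)
{
  assert(file_size > 0);
  assert(offset < file_size * n_files);
  return {size_t(offset / file_size), offset % file_size};
}
```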

fil_open_log_and_system_tablespace_files(): renamed to
  fil_open_system_tablespace_files()
  and obviously it now doesn't open redo log files

global files[1000]: removed. Why was it needed at all?
2020-01-01 22:09:51 +08:00
Marko Mäkelä
8cc15c036d Merge 10.4 into 10.5 2019-12-27 21:17:16 +02:00
Marko Mäkelä
4c25e75ce7 Merge 10.3 into 10.4 2019-12-27 18:20:28 +02:00
Marko Mäkelä
5ab70e7f68 Merge 10.2 into 10.3 2019-12-27 15:14:48 +02:00
Thirunarayanan Balathandayuthapani
bba59abb03 MDEV-19176 Reduce the memory usage during recovery
- Moved the recv_sys->heap memory condition inside recv_parse_log_recs(),
so that InnoDB can mark the status as STORE_NO earlier.

- InnoDB uses one third of the buffer pool chunk size for reading the redo
log records. That way, we can avoid running out of buffer pool memory
during recovery.
2019-12-23 15:51:02 +05:30
Sergei Golubchik
3b654d54c1 longer regex error messages 2019-12-21 10:34:02 +01:00
Alexey Botchkov
9dadfdcde5 MDEV-14024 PCRE2.
Related changes in the server code.
2019-12-21 10:34:02 +01:00
Marko Mäkelä
28c89b7151 Merge 10.4 into 10.5 2019-12-16 07:47:17 +02:00
Marko Mäkelä
8fa759a576 Merge 10.3 into 10.4
We disable the MDEV-21189 test galera.galera_partition
because it times out.
2019-12-13 17:30:37 +02:00
Marko Mäkelä
0a20e5ab77 Merge 10.2 into 10.3 2019-12-12 14:41:51 +02:00
Vlad Lesin
beec9c0e19 MDEV-21255: Deadlock of parallel slave and mariabackup (with failed log
copy thread)

mariabackup hangs while waiting for the InnoDB redo log copy thread to read
the log up to a certain LSN, and it waits under FTWRL. If a redo log read
error occurs in that thread, the thread finishes, and the main thread knows
nothing about it, which leads to the hang. Because mariabackup hangs under
FTWRL, slave threads on the server side can be blocked due to an MDL conflict.

The fix is to finish mariabackup with an error message on InnoDB redo log read
failure.
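
The general shape of such a fix, sketched with standard C++
primitives rather than the actual mariabackup code: the waiter also
wakes up when the copier reports a failure. All names here are
hypothetical:

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical shared state between a log copier and a waiting thread.
struct log_copy_state
{
  std::mutex m;
  std::condition_variable cv;
  uint64_t copied_lsn = 0;
  bool failed = false;

  // Called by the copier after it has copied the log up to lsn.
  void advance(uint64_t lsn)
  {
    { std::lock_guard<std::mutex> g(m); copied_lsn = lsn; }
    cv.notify_all();
  }

  // Called by the copier on a read error, before it exits.
  void fail()
  {
    { std::lock_guard<std::mutex> g(m); failed = true; }
    cv.notify_all();
  }

  // Returns true once lsn has been copied, or false if the copier
  // failed first, so that the caller can report an error and exit
  // instead of waiting forever.
  bool wait_for(uint64_t lsn)
  {
    std::unique_lock<std::mutex> l(m);
    cv.wait(l, [&] { return failed || copied_lsn >= lsn; });
    return copied_lsn >= lsn;
  }
};
```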
2019-12-12 13:28:30 +03:00
Vladislav Vaintroub
202a62deb0 MDEV-11345 Compile english error messages into mysqld executable.
Simplify loading messages into mariabackup. Do the same as the server does.
We're forcing English, so there is no attempt to load errmsg.sys
2019-12-11 16:15:40 +01:00
Oleksandr Byelkin
a15234bf4b Merge branch '10.3' into 10.4 2019-12-09 15:09:41 +01:00
Marko Mäkelä
42a4ae54c2 MDEV-21225 Remove ut_align() and use aligned_malloc()
Before commit 90c52e5291 introduced
aligned_malloc(), InnoDB always used a pattern of over-allocating
memory and invoking ut_align() to guarantee the desired alignment.

It is cleaner to invoke aligned_malloc() and aligned_free() directly.
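
For reference, the over-allocation pattern looks roughly like this.
The sketch uses std::malloc() and hypothetical helper names; it does
not reproduce the real signatures of ut_align() or aligned_malloc():

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical equivalent of rounding a pointer up to a
// power-of-two alignment.
inline void *align_up(void *ptr, size_t alignment)
{
  assert(alignment && !(alignment & (alignment - 1)));
  const uintptr_t p = reinterpret_cast<uintptr_t>(ptr);
  return reinterpret_cast<void*>((p + alignment - 1)
                                 & ~uintptr_t(alignment - 1));
}

// Older pattern: allocate alignment extra bytes and align inside the
// buffer. The caller must remember *raw and pass that pointer, not
// the aligned one, to std::free().
inline void *alloc_overallocated(size_t size, size_t alignment, void **raw)
{
  *raw = std::malloc(size + alignment);
  return *raw ? align_up(*raw, alignment) : nullptr;
}
```

Having to carry two pointers around (one for use, one for freeing) is
the main inconvenience that a dedicated aligned allocator avoids.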

ut_align(): Remove. In assertions, ut_align_down() can be used instead.
2019-12-05 06:42:31 +02:00
Jan Lindström
9d9a2253c6 Merge remote-tracking branch 10.2 into 10.3
Conflicts:
	mysql-test/suite/galera/t/galera_binlog_event_max_size_max-master.opt
	mysql-test/suite/innodb/r/innodb-mdev-7513.result
	mysql-test/suite/innodb/t/innodb-mdev-7513.test
	mysql-test/suite/wsrep/disabled.def
	storage/innobase/ibuf/ibuf0ibuf.cc
2019-12-02 14:35:10 +02:00
Faustin Lammler
2df2238cb8 Lintian complains on spelling error
The lintian check complains on spelling error:
https://salsa.debian.org/mariadb-team/mariadb-10.3/-/jobs/95739
2019-12-02 12:41:13 +02:00
Vlad Lesin
bd11bd63cc MDEV-18310: Aria engine: Undo phase failed with "Got error 121 when
executing undo undo_key_delete" upon startup on datadir restored from
incremental backup

aria_log* files were not copied on --prepare --incremental-dir step from
incremental to destination backup directory.
2019-11-29 17:01:12 +03:00
Marko Mäkelä
312569e2fd MDEV-21132 Remove buf_page_t::newest_modification
At each mini-transaction commit, the log sequence number of the
mini-transaction must be written to each modified page, so that
it will be available in the FIL_PAGE_LSN field when the page is
being read in crash recovery.

InnoDB was unnecessarily allocating redundant storage for the
field, in buf_page_t::newest_modification. Let us access
FIL_PAGE_LSN directly.

Furthermore, on ALTER TABLE...IMPORT TABLESPACE, let us write
0 to FIL_PAGE_LSN instead of using log_sys.lsn.

buf_flush_init_for_writing(), buf_flush_update_zip_checksum(),
fil_encrypt_buf_for_full_crc32(), fil_encrypt_buf(),
fil_space_encrypt(): Remove the parameter lsn.

buf_page_get_newest_modification(): Merge with the only caller.

buf_tmp_reserve_compression_buf(), buf_tmp_page_encrypt(),
buf_page_encrypt(): Define static in the same compilation unit
with the only caller.

PageConverter::m_current_lsn: Remove. Write 0 to FIL_PAGE_LSN
on ALTER TABLE...IMPORT TABLESPACE.
2019-11-25 09:39:51 +02:00
Marko Mäkelä
a9846f3299 Merge 10.4 into 10.5 2019-11-19 10:45:28 +08:00
Vladislav Vaintroub
5e62b6a5e0 MDEV-16264 Use threadpool for Innodb background work.
Almost all threads have gone
- the "ticking" threads that sleep a while and then do some work
(srv_monitor_thread, srv_error_monitor_thread, srv_master_thread)
were replaced with timers. Some timers are periodic,
e.g. the "master" timer.

- The btr_defragment_thread is also replaced by a timer, which
reschedules itself when the current defragment "item" needs throttling

- the buf_resize_thread and buf_dump_threads are substituted with tasks.
Ditto with page cleaner workers.

- purge worker threads are now tasks as well, and purge cleaner
coordinator is a combination of a task and timer.

- All AIO is outsourced to tpool; InnoDB just calls thread_pool::submit_io()
and provides the callback.

- The srv_slot_t was removed, and innodb_debug_sync used in purge
is currently not working, and needs reimplementation.
2019-11-15 18:09:30 +01:00
Vladislav Vaintroub
009674dc52 Fix a couple of clang-cl warnings 2019-11-15 15:39:31 +01:00
Vladislav Vaintroub
6df0bb7d38 MDEV-21062 Buildbot, Windows - sporadically missing lines from mtr's "exec"
Provide our own version of popen/pclose, in an attempt to work around
sporadic erratic behavior of the UCRT version.
2019-11-15 15:39:31 +01:00
Marko Mäkelä
5ed54e78ac Cleanup: Remove redundant XDES_FREE_BIT parameters
The page allocation bitmaps in the extent descriptor pages
contain two bits per page: XDES_FREE_BIT and XDES_CLEAN_BIT,
which is unused. Simplify read access.
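
Reading one of the two per-page bits can be sketched as below. The
bit positions and the least-significant-bit-first order within each
byte are assumptions made for this sketch, and the names are
hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTIONS: two bits per page, the "free" bit is the first of the
// two, and bits are numbered from the least significant bit of each byte.
constexpr unsigned BITS_PER_PAGE = 2;
constexpr unsigned FREE_BIT = 0;

// Hypothetical reader: is the given page of the extent marked free?
inline bool is_free(const uint8_t *bitmap, unsigned page_in_extent)
{
  assert(bitmap);
  const unsigned index = FREE_BIT + BITS_PER_PAGE * page_in_extent;
  return (bitmap[index / 8] >> (index % 8)) & 1;
}
```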

xdes_is_free(descr,mtr): Remove. Use !xdes_get_n_used(descr) instead.

xdes_is_free(): Replaces xdes_get_bit(), xdes_mtr_get_bit().

xdes_find_free(): Replaces xdes_find_bit().

fsp_seg_inode_page_get_nth_inode(): Remove the redundant parameters
physical_size, mtr.

fsp_seg_inode_page_find_used(), fsp_seg_inode_page_find_free():
Remove the redundant parameter mtr.
2019-11-08 13:45:02 +02:00
Marko Mäkelä
52246dff2c Merge 10.4 into 10.5 2019-11-08 09:43:41 +02:00
Marko Mäkelä
78d0d2cdc5 Cleanup: Remove mach_read_ulint()
The function mach_read_ulint() is a wrapper for the lower-level
functions mach_read_from_1(), mach_read_from_2(), mach_read_from_8().
Invoke those functions directly, for better readability of the code.
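
The fixed-width readers decode big-endian integers from a byte
buffer. A sketch with hypothetical names, not the actual
mach_read_from_N() implementations:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reader for a 2-byte big-endian integer.
inline uint16_t read_be2(const uint8_t *b)
{
  assert(b);
  return uint16_t(b[0] << 8 | b[1]);
}

// Hypothetical reader for an 8-byte big-endian integer.
inline uint64_t read_be8(const uint8_t *b)
{
  assert(b);
  uint64_t v = 0;
  for (int i = 0; i < 8; i++)
    v = v << 8 | b[i];
  return v;
}
```

Calling a reader of known width makes the size of the field visible
at the call site, which a run-time width parameter would hide.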

mtr_t::read_ulint(), mtr_read_ulint(): Remove. Yes, we will lose the
ability to assert that the read is covered by the mini-transaction.
We would still check that on writes, and any writes that
wrongly bypass mini-transaction logging would likely be caught by
stress testing with Mariabackup.
2019-11-08 09:41:06 +02:00
Marko Mäkelä
77e8a311e1 Merge 10.4 into 10.5
A conflict between MDEV-19514 (b42294bc64)
and MDEV-20934 (d7a2401750)
was resolved. We will not invoke the function ibuf_delete_recs()
from ibuf_merge_or_delete_for_page(). Instead, we will add that
logic to the function ibuf_read_merge_pages().
2019-11-07 10:34:33 +02:00
Marko Mäkelä
928abd6967 Merge 10.3 into 10.4 2019-11-06 13:44:56 +02:00
Marko Mäkelä
908ca4668d Merge 10.2 into 10.3 2019-11-06 13:14:31 +02:00
Marko Mäkelä
8688ef22c2 Merge 10.1 to 10.2 2019-11-06 10:18:51 +02:00
Marko Mäkelä
5164f8c206 Fix GCC 9.2.1 -Wstringop-truncation
dict_table_rename_in_cache(): Use strcpy() instead of strncpy(),
because they are known to be equivalent in this case (the length
of old_name was already validated).

mariabackup: Invoke strncpy() with one less than the buffer size,
and explicitly add NUL as the last byte of the buffer.
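
The mariabackup pattern, as a sketch with a hypothetical helper name:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy at most dst_size - 1 bytes and always
// terminate the destination, so that truncation cannot leave the
// buffer without a NUL terminator.
inline void copy_truncated(char *dst, size_t dst_size, const char *src)
{
  assert(dst && src && dst_size > 0);
  std::strncpy(dst, src, dst_size - 1);
  dst[dst_size - 1] = '\0';
}
```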
2019-11-04 15:52:54 +02:00
Oleksandr Byelkin
903f5fea30 Revert "wolfssl 4.2.0" (it is not ready yet)
This reverts commit dacd1794e4.
2019-11-02 18:54:01 +01:00
Oleksandr Byelkin
dacd1794e4 wolfssl 4.2.0 2019-11-02 12:11:39 +01:00