Commit graph

3012 commits

Author SHA1 Message Date
Marko Mäkelä
82c465f68e MDEV-21962: Minor cleanup
Remove unnecessary buf_pool_t:: qualifiers. In comments,
replace buf_pool::mutex with buf_pool.mutex.

Remove an outdated comment about a planned buffer pool resizing feature.
It is already implemented in MariaDB 10.2.2 (and MySQL 5.7.9).
2020-03-23 10:50:30 +02:00
Marko Mäkelä
3b25083785 Merge 10.4 into 10.5 2020-03-23 10:50:14 +02:00
Thirunarayanan Balathandayuthapani
f7599f4799 MDEV-21792 Server aborts upon attempt to create foreign key on spatial field
- mbmaxlen is always 0 for binary field. Ignore the assert of checking
field->prefix_len % field->col->mbmaxlen == 0.
2020-03-23 13:18:00 +05:30
Sergey Vojtovich
81f700015e Cleanup my_atomic.h includes
my_atomic.h is included indirectly anyways.
2020-03-21 20:11:44 +04:00
Marko Mäkelä
5203bc10f1 Merge 10.4 into 10.5 2020-03-21 11:37:10 +02:00
Marko Mäkelä
bd3c8f47cd Merge 10.3 into 10.4 2020-03-20 22:06:55 +02:00
Marko Mäkelä
44298e4dea Merge 10.2 into 10.3
Also, clean up the test innodb_gis.geometry a little further.
2020-03-20 18:12:17 +02:00
Eugene Kosov
bb24fa31fa move my_assume_aligned() to a separate header 2020-03-20 18:47:35 +03:00
Marko Mäkelä
6960e9ed24 MDEV-21983: Crash on DROP/RENAME TABLE after DISCARD TABLESPACE
fil_delete_tablespace(): Remove the unused parameter drop_ahi,
and add the parameter if_exists=false. We want to suppress
error messages if we know that the tablespace has been discarded.

dict_table_rename_in_cache(): Pass the new parameter to
fil_delete_tablespace(), that is, do not complain about
missing tablespace if the tablespace has been discarded.

row_make_new_pathname(): Declare as static.

row_drop_table_for_mysql(): Tolerate !table->data_dir_path
when the tablespace has been discarded.

row_rename_table_for_mysql(): Skip part of the RENAME TABLE
when fil_space_get_first_path() returns NULL.
2020-03-19 14:23:47 +02:00
Marko Mäkelä
a786f50de5 MDEV-21962 Allocate buf_pool statically
Thanks to MDEV-15058, there is only one InnoDB buffer pool.
Allocating buf_pool statically removes one level of pointer indirection
and makes code more readable, and removes the awkward initialization of
some buf_pool members.

While doing this, we will also declare some buf_pool_t data members
private and replace some functions with member functions. This is
mostly affecting buffer pool resizing.

This is not aiming to be a complete rewrite of buf_pool_t to
a proper class. Most of the buffer pool interface, such as
buf_page_get_gen(), will remain in the C programming style
for now.

buf_pool_t::withdrawing: Replaces buf_pool_withdrawing.
buf_pool_t::withdraw_clock_: Replaces buf_withdraw_clock.

buf_pool_t::create(): Repalces buf_pool_init().
buf_pool_t::close(): Replaces buf_pool_free().

buf_bool_t::will_be_withdrawn(): Replaces buf_block_will_be_withdrawn(),
buf_frame_will_be_withdrawn().

buf_pool_t::clear_hash_index(): Replaces buf_pool_clear_hash_index().
buf_pool_t::get_n_pages(): Replaces buf_pool_get_n_pages().
buf_pool_t::validate(): Replaces buf_validate().
buf_pool_t::print(): Replaces buf_print().
buf_pool_t::block_from_ahi(): Replaces buf_block_from_ahi().
buf_pool_t::is_block_field(): Replaces buf_pointer_is_block_field().
buf_pool_t::is_block_mutex(): Replaces buf_pool_is_block_mutex().
buf_pool_t::is_block_lock(): Replaces buf_pool_is_block_lock().
buf_pool_t::is_obsolete(): Replaces buf_pool_is_obsolete().
buf_pool_t::io_buf: Make default-constructible.
buf_pool_t::io_buf::create(): Delayed 'constructor'
buf_pool_t::io_buf::close(): Early 'destructor'

HazardPointer: Make default-constructible. Define all member functions
inline, also for derived classes.
2020-03-18 22:32:40 +02:00
Thirunarayanan Balathandayuthapani
09e8707d90 MDEV-21826 Recovery failure : loop of Read redo log up to LSN
- This issue is caused by MDEV-19176
(bba59abb03).
- Problem is that there is miscalculation of available memory during
recovery if innodb_buffer_pool_instances > 1.
- Ignore the buffer pool instance while calculating available_memory
- Removed recv_n_pool_free_frames variable and use buf_pool_get_n_pages()
instead.
2020-03-18 15:25:28 +05:30
Marko Mäkelä
c7ba92372b Merge 10.4 into 10.5 2020-03-17 07:58:41 +02:00
Eugene Kosov
7dafab7569 MDEV-21949 key rotation for innodb_encrypt_log is not working in 10.5
log_t::has_encryption_key_rotation(): checks whether
key rotation is supported.

In a subsequent redo log format version, this key rotation
may be broken again.
2020-03-16 17:27:51 +03:00
Eugene Kosov
0c2365c4e3 cleanup redo log
Write log header just ones when file is created, instead of
writing to it on every log file wrap around.

log_t::file::write_header_durable(): this one writes to log header

log_write_buf(): this one stops writing to log header
2020-03-16 17:27:51 +03:00
Marko Mäkelä
e5e95a287e Merge 10.3 into 10.4 2020-03-16 16:24:36 +02:00
Eugene Kosov
774fe8969a cleanup redo log 2020-03-14 00:52:21 +03:00
Marko Mäkelä
5fe87ac413 Merge 10.2 into 10.3 2020-03-13 12:31:55 +02:00
Marko Mäkelä
fbe662a503 MDEV-15058: Remove buf_pool_get_dirty_pages_count()
Starting with commit 1a6f708ec5
the function buf_pool_get_dirty_pages_count() is only used
in a debug check. It was dead code for non-debug builds.

buf_flush_dirty_pages(): Perform the debug check inline,
and replace the assertion
	ut_ad(first || buf_pool_get_dirty_pages_count(id) == 0);
with another one that is executed while holding the mutexes:
	ut_ad(id != bpage->id.space());
2020-03-13 10:09:15 +02:00
Marko Mäkelä
2e8b0c56a0 MDEV-21933 INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES accesses SYS_DATAFILES
All tablespace metadata is buffered in fil_system. There is a LRU
mechanism, but that only controls the opening and closing of
fil_node_t::handle.

It is much more efficient and less error-prone to access data file names
by looking up the fil_space_t object rather than by essentially joining
each row with an access to SYS_DATAFILES via the InnoDB internal SQL parser.

dict_get_first_path(): Declare static. The function may only be needed
when loading or updating the data dictionary. Also, change a condition
in order to avoid a bogus GCC 10 -Wstringop-overflow warning for
mem_strdupl() about len==ULINT_UNDEFINED.

i_s_sys_tablespaces_fill_table(): Do not access other InnoDB internal
dictionary tables than SYS_TABLESPACES.
2020-03-13 08:07:02 +02:00
Marko Mäkelä
32904dc5fa Merge 10.1 into 10.2 2020-03-13 07:20:36 +02:00
Marko Mäkelä
f224525204 MDEV-21907: InnoDB: Enable -Wconversion on clang and GCC
The -Wconversion in GCC seems to be stricter than in clang.
GCC at least since version 4.4.7 issues truncation warnings for
assignments to bitfields, while clang 10 appears to only issue
warnings when the sizes in bytes rounded to the nearest integer
powers of 2 are different.

Before GCC 10.0.0, -Wconversion required more casts and would not
allow some operations, such as x<<=1 or x+=1 on a data type that
is narrower than int.

GCC 5 (but not GCC 4, GCC 6, or any later version) is complaining
about x|=y even when x and y are compatible types that are narrower
than int.  Hence, we must rewrite some x|=y as
x=static_cast<byte>(x|y) or similar, or we must disable -Wconversion.

In GCC 6 and later, the warning for assigning wider to bitfields
that are narrower than 8, 16, or 32 bits can be suppressed by
applying a bitwise & with the exact bitmask of the bitfield.
For older GCC, we must disable -Wconversion for GCC 4 or 5 in such
cases.

The bitwise negation operator appears to promote short integers
to a wider type, and hence we must add explicit truncation casts
around them. Microsoft Visual C does not allow a static_cast to
truncate a constant, such as static_cast<byte>(1) truncating int.
Hence, we will use the constructor-style cast byte(~1) for such cases.

This has been tested at least with GCC 4.8.5, 5.4.0, 7.4.0, 9.2.1, 10.0.0,
clang 9.0.1, 10.0.0, and MSVC 14.22.27905 (Microsoft Visual Studio 2019)
on 64-bit and 32-bit targets (IA-32, AMD64, POWER 8, POWER 9, ARMv8).
2020-03-12 19:46:41 +02:00
Marko Mäkelä
c7920fa8ff MDEV-16264: Eliminate unsafe os_aio_userdata_t type cast 2020-03-12 19:43:45 +02:00
Marko Mäkelä
8be3794b42 MDEV-21924 Clean up InnoDB GIS record comparison
The extension of the record comparison functions for SPATIAL INDEX in
mysql/mysql-server@b66ad511b6
was suboptimal for multiple reasons:

Some functions used unnecessary temporary variables of the int type,
instead of the more appropriate size_t, causing type mismatch.

Many functions unnecessarily required rec_get_offsets() to be
computed, or a parameter for length, although the size of the
minimum bounding rectangle (MBR) is hard-coded as
SPDIMS * 2 * sizeof(double), or 32 bytes.

In InnoDB SPATIAL INDEX records, there always is a 32-byte key
followed by either a 4-byte child page number or the PRIMARY KEY value.

The length parameters were not properly validated.
The function cmp_geometry_field() was making an incorrect attempt
at checking that the lengths are at least sizeof(double) (8 bytes),
even though the function is accessing up to 32 bytes in both MBR.

Functions that are called from only one compilation unit are defined
in another compilation unit, making the code harder to follow and
potentially slower to execute.

cmp_dtuple_rec_with_gis(): FIXME: Correct the debug assertion
and possibly the function TABLE_SHARE::init_from_binary_frm_image()
or related code, which causes an unexpected length of
DATA_MBR_LEN + 2 bytes to be passed to this function.
2020-03-12 18:13:53 +02:00
Eugene Kosov
df88e7cefa fix typedef-related warning and cleanup using namespace std 2020-03-11 16:27:37 +03:00
Marko Mäkelä
574d8b2940 MDEV-21907: Fix most clang -Wconversion in InnoDB
Declare innodb_purge_threads as 4-byte integer (UINT)
instead of 4-or-8-byte (ULONG) and adjust the documentation string.
2020-03-11 08:29:48 +02:00
Sergei Golubchik
7af733a5a2 perfschema compilation, test and misc fixes 2020-03-10 19:24:23 +01:00
Sergei Golubchik
6ded554fc2 perfschema thread instrumentation related changes 2020-03-10 19:24:23 +01:00
Sergei Golubchik
22b6d8487a perfschema file instrumentation related changes 2020-03-10 19:24:22 +01:00
Sergei Golubchik
7c58e97bf6 perfschema memory related instrumentation changes 2020-03-10 19:24:22 +01:00
Sergei Golubchik
2ac3121af2 perfschema - various collateral cleanups and small changes 2020-03-10 19:24:22 +01:00
Marko Mäkelä
e2e2f89303 MDEV-15528: Minor cleanup
buf_flush_freed_page(): Reformat in the common style, and
simplify some code. Prefer to request all information from
smaller data structures (buf_page_t) than from fil_space_t
or the global variable srv_immediate_scrub_data_uncompressed.

SysTablespace::open_or_create(): Assert that the temporary
tablespace will not be created in page_compressed format, so that
buf_flush_freed_page() can avoid checking that on every call.

IORequest: Remove duplicated constructors, and do not explicitly
declare a default constructor.
2020-03-10 09:53:23 +01:00
Thirunarayanan Balathandayuthapani
a5584b13d1 MDEV-15528 Punch holes when pages are freed
The following parameters are deprecated:

  innodb-background-scrub-data-uncompressed
  innodb-background-scrub-data-compressed
  innodb-background-scrub-data-interval
  innodb-background-scrub-data-check-interval

Removed scrubbing code completely(btr0scrub.h, btr0scrub.cc)
Removed information_schema.innodb_tablespaces_scrubbing tables
Removed the scrubbing logic from fil_crypt_thread()
2020-03-10 10:51:08 +05:30
Thirunarayanan Balathandayuthapani
a35b4ae898 MDEV-15528 Punch holes when pages are freed
When a InnoDB data file page is freed, its contents becomes garbage,
and any storage allocated in the data file is wasted. During flushing,
InnoDB initializes the page with zeros if scrubbing is enabled. If the
tablespace is compressed then InnoDB should punch a hole else ignore the
flushing of the freed page.

buf_page_t:
- Replaced the variable file_page_was_freed, init_on_flush in buf_page_t
with status enum variable.
- Changed all debug assert of file_page_was_freed to DBUG_ASSERT
of buf_page_t::status

Removed buf_page_set_file_page_was_freed(),
buf_page_reset_file_page_was_freed().

buf_page_free(): Newly added function which takes X-lock on the page
before marking the status as FREED. So that InnoDB flush handler can
avoid concurrent flush of the freed page. Also while flushing the page,
InnoDB make sure that redo log which does freeing of the page also written
to the disk. Currently, this function only marks the page as FREED if
it is in buffer pool

buf_flush_freed_page(): Newly added function which initializes zeros
asynchorously if innodb_immediate_scrub_data_uncompressed is enabled.
Punch a hole to the file synchorously if page_compressed is enabled.
Reset the io_fix to NORMAL. Release the block from flush list and
associated mutex before writing zeros or punch a hole to the file.

buf_flush_page(): Removed the unnecessary usage of temporary
variable "flush"

fil_io(): Introduce new parameter called punch_hole. It allows fil_io()
to punch the hole to the file for the given offset.

buf_page_create(): Let the callers assign buf_page_t::status.
Every caller should eventually invoke mtr_t::init().

fsp_page_create(): Remove the unused mtr_t parameter.

In all other callers of buf_page_create() except fsp_page_create(),
before invoking mtr_t::init(), invoke
mtr_t::sx_latch_at_savepoint() or mtr_t::x_latch_at_savepoint().

mtr_t::init(): Initialize buf_page_t::status also for the temporary
tablespace (when redo logging is disabled), to avoid assertion failures.
2020-03-10 10:51:08 +05:30
Marko Mäkelä
57c592f74d Cleanup: Remove recv_sys.remove_extra_log_files
create_log_file(): Delete all old redo log files where they used to be
deleted, after the crash injection point innodb_log_abort_6,
before commit 9ef2d29ff4
deprecated and ignored the setting innodb_log_files_in_group.
2020-03-07 14:47:15 +02:00
Marko Mäkelä
70f0dbe4d3 Cleanup: log upgrade and encryption
log_crypt_101_read_checkpoint(), log_crypt_101_read_block():
Declare as ATTRIBUTE_COLD. These are only used when
checking that a MariaDB 10.1 encrypted redo log is clean.

log_block_calc_checksum_format_0(): Define in the only
compilation unit where it is needed. This is only used
when reading the checkpoint information from redo logs
before MariaDB 10.2.2.

crypt_info_t: Declare the byte arrays directly with alignas().

log_crypt(): Use memcpy_aligned instead of reinterpret_cast
on integers.
2020-03-07 14:31:36 +02:00
Marko Mäkelä
522fbfcb5c Cleanup: Remove recv_sys.buf_size
Also, correctly document what recv_sys.mutex is protecting.
2020-03-07 12:01:12 +02:00
Marko Mäkelä
a4ab54d70f MDEV-14425 Cleanup: Use std::atomic for some log_sys members
Some fields were protected by log_sys.mutex, which adds quite some
overhead for readers. Some readers were submitting dirty reads.

log_t::lsn: Declare private and atomic. Add wrappers get_lsn()
and set_lsn() that will use relaxed memory access. Many accesses
to log_sys.lsn are still protected by log_sys.mutex; we avoid the
mutex for some readers.

log_t::flushed_to_disk_lsn: Declare private and atomic, and move
to the same cache line with log_t::lsn.

log_t::buf_free: Declare as size_t, and move to the same cache line
with log_t::lsn.

log_t::check_flush_or_checkpoint_: Declare private and atomic,
and move to the same cache line with log_t::lsn.

log_get_lsn(): Define as an alias of log_sys.get_lsn().

log_get_lsn_nowait(), log_peek_lsn(): Remove.

log_get_flush_lsn(): Define as an alias of log_sys.get_flush_lsn().

log_t::initiate_write(): Replaces log_buffer_sync_in_background().
2020-03-05 16:21:31 +02:00
Eugene Kosov
555f955a16 use O_DSYNC for InnoDB
O_DSYNC is faster than O_SYNC because it syncs as little as needed
(e.g. no timestamp changes)

This change is similar to change fsync() -> fdatasync() in MDEV-21382
2020-03-05 11:35:09 +03:00
Marko Mäkelä
4b42fa3ce3 MDEV-14425: Remove the unused function mtr_write_log()
This amends commit 37e7bde12a
2020-03-05 08:51:40 +02:00
Marko Mäkelä
1312b4ebb6 MDEV-14425 preparation: Provide ut_crc32_low()
The ut_crc32() function uses a hard-coded initial CRC-32C value of 0.
Replace it with ut_crc32_low(), which allows to specify the initial
checksum value, and provide an inlined compatibility wrapper ut_crc32().

Also, remove non-inlined wrapper functions on ARMv8 and POWER8,
and remove dead code (the generic implementation) on POWER8.

Note: The original AMD64 instruction set architecture in 2003 only
included SSE2. The CRC-32C instructions are part of the SSE4.2
instruction set extension for IA-32 and AMD64, with first processors
released in November 2007 (using the AMD Barcelona microarchitecture)
and November 2008 (Intel Nehalem microarchiteture). It might be safe
to assume that SSE4.2 is available on all currently used AMD64 based
systems, but we are not taking that step yet.
2020-03-05 07:39:04 +02:00
Marko Mäkelä
64be4ab4a8 MDEV-21870 Deprecate and ignore innodb_scrub_log and innodb_scrub_log_speed
The configuration parameter innodb_scrub_log never really worked, as
reported in MDEV-13019 and MDEV-18370.

Because MDEV-14425 is changing the redo log format, the innodb_scrub_log
feature would have to be adjusted for it. Due to the known problems,
it is easier to remove the feature for now, and to ignore and deprecate
the parameters.

If old log contents should be kept secret, then enabling innodb_encrypt_log
or setting a smaller innodb_log_file_size could help.
2020-03-04 19:01:09 +02:00
Marko Mäkelä
9e488653ae Cleanup: Make MONITOR_LSN_CHECKPOINT_AGE a value.
Compute MONITOR_LSN_CHECKPOINT_AGE on demand in
srv_mon_process_existing_counter().
This allows us to remove the overhead of MONITOR_SET
calls for the counter.
2020-03-04 12:59:20 +02:00
Marko Mäkelä
4383897a01 MDEV-14425 preparation: Remove log_header_read()
The function log_header_read() was only used during server startup,
and it will mostly be used only for reading checkpoint information
from pre-MDEV-14425 format redo log files.

Let us replace the function with more direct calls, so that
it is clearer what is going on. It is not strictly necessary to
hold any mutex during this operation, and because there will be
only a limited number of operations during early server startup,
it is not necessary to increment any I/O counters.
2020-03-04 10:08:33 +02:00
Marko Mäkelä
37e7bde12a MDEV-14425 preparation: Remove log_t::append_on_checkpoint
Simplify the logging of ALTER TABLE operations, by making use of the
TRX_UNDO_RENAME_TABLE undo log record that was introduced in
commit 0bc36758ba.

commit_try_rebuild(): Invoke row_rename_table_for_mysql() and
actually rename the files before committing the transaction.

fil_mtr_rename_log(), commit_cache_rebuild(),
log_append_on_checkpoint(), row_merge_rename_tables_dict(): Remove.

mtr_buf_copy_t, log_t::append_on_checkpoint: Remove.

row_rename_table_for_mysql(): If !use_fk, ignore missing foreign
keys. Remove a call to dict_table_rename_in_cache(), because
trx_rollback_to_savepoint() should invoke the function if needed.
2020-03-03 22:25:20 +02:00
Marko Mäkelä
fae259f036 MDEV-12353: Introduce an EXTENDED record subtype TRIM_PAGES
For undo log truncation, commit 055a3334ad
repurposed the MLOG_FILE_CREATE2 record with a nonzero page size
to indicate that an undo tablespace will be shrunk in size.
In commit 7ae21b18a6 the
MLOG_FILE_CREATE2 record was replaced by a FILE_CREATE record.

Now that the redo log encoding was changed, there is no actual need
to write a file name in the log record; it suffices to write the
page identifier of the first page that is not part of the file.

This TRIM_PAGES record could allow us to shrink any data files in the
future. For now, it will be limited to undo tablespaces.

mtr_t::log_file_op(): Remove the parameter first_page_no, because
it would always be 0 for file operations.

mtr_t::trim_pages(): Replaces fil_truncate_log().

mtr_t::log_write(): Avoid same_page encoding if !bpage&&!m_last.

fil_op_replay_rename(): Remove the constant parameter first_page_no=0.
2020-03-03 13:25:45 +02:00
Marko Mäkelä
8511f04fdb Cleanup: Remove srv_start_lsn
Most of the time, we can refer to recv_sys.recovered_lsn.
2020-03-02 15:01:46 +02:00
Marko Mäkelä
55a5b5baf6 MDEV-12353 cleanup: Simplify mtr_t::undo_append() 2020-03-02 10:07:01 +02:00
Vladislav Vaintroub
30ea63b7d2 MDEV-21534 - Improve innodb redo log group commit performance
Introduce special synchronization primitive  group_commit_lock
for more efficient synchronization of redo log writing and flushing.

The goal is to reduce CPU consumption on log_write_up_to, to reduce
the spurious wakeups, and improve the throughput in write-intensive
benchmarks.
2020-03-01 19:02:21 +01:00
Marko Mäkelä
138cbec5f2 MDEV-21724: Optimize page_cur_insert_low() redo logging
Inserting a record into an index page involves updating multiple
fields in the page header as well as updating the next-record links
and potentially updating fields related to the sparse page directory.

Let us cover the insert operations by higher-level log records, to avoid
'redundant' logging about the writes.

The code for applying the high-level log records will check the
consistency of the page thoroughly, to avoid crashes during recovery.
We will refuse to replay the inserts if any inconsistency is detected.
With innodb_force_recovery=1, recovery will continue, but the affected
pages may be more inconsistent if some changes were omitted.

mrec_ext_t: Introduce the EXTENDED record subtypes
INSERT_HEAP_REDUNDANT, INSERT_REUSE_REDUNDANT,
INSERT_HEAP_DYNAMIC, INSERT_REUSE_DYNAMIC.
The record will explicitly identify the page type and whether
the space will be allocated from PAGE_HEAP_TOP or reused from
the PAGE_FREE list. It will also tell how many bytes to copy
from the preceding record header and payload, and how to
initialize the rest of the record header and payload.

mtr_t::page_insert(): Write the high-level log records.

log_phys_t::apply(): Parse the high-level log records.

page_apply_insert_redundant(), page_apply_insert_dynamic():
Apply the high-level log records.

page_dir_split_slot(): Introduce a variant that does not write log
nor deal with ROW_FORMAT=COMPRESSED pages.

page_mem_alloc_heap(): Remove the mtr_t parameter

page_cur_insert_rec_low(): Write log only via mtr_t::page_insert().
2020-02-27 17:19:44 +02:00
Marko Mäkelä
dee6fb356b MDEV-12353 Cleanup: Remove page_rec_get_base_extra_size()
The function page_rec_get_base_extra_size() became dead code in
commit 08ba388713.
2020-02-27 17:15:20 +02:00