Commit graph

316 commits

Author SHA1 Message Date
Marko Mäkelä
acd265b69b MDEV-12353: Exclusively use page_zip_reorganize() for ROW_FORMAT=COMPRESSED
page_zip_reorganize(): Restore the page on failure.
In callers, omit now-redundant calls to page_zip_decompress().

btr_page_reorganize_low(): Define in static scope only, and
remove the z_level parameter. Assert that ROW_FORMAT is not COMPRESSED.

btr_page_reorganize_block(), btr_page_reorganize(): Invoke
page_zip_reorganize() for ROW_FORMAT=COMPRESSED.
2020-02-13 18:19:14 +02:00
Marko Mäkelä
5bea43f5e0 MDEV-12353: Deprecate and ignore innodb_log_compressed_pages
page_zip_compress_write_log_no_data(): Remove.
We no longer write the MLOG_ZIP_PAGE_COMPRESS_NO_DATA record.
Instead, we will write MLOG_ZIP_PAGE_COMPRESS records.
2020-02-13 18:19:13 +02:00
Marko Mäkelä
600eae9179 MDEV-12353: Remove MTR_LOG_SHORT_INSERTS
No longer emit the redo log records
MLOG_LIST_END_COPY_CREATED, MLOG_COMP_LIST_END_COPY_CREATED.
2020-02-13 18:19:13 +02:00
Marko Mäkelä
1a6f708ec5 MDEV-15058: Deprecate and ignore innodb_buffer_pool_instances
Our benchmarking efforts indicate that the reasons for splitting the
buf_pool in commit c18084f71b
have mostly gone away, possibly as a result of
mysql/mysql-server@ce6109ebfd
or similar work.

Only in one write-heavy benchmark where the working set size is
ten times the buffer pool size, the buf_pool->mutex would be
less contended with 4 buffer pool instances than with 1 instance,
in buf_page_io_complete(). That contention could be alleviated
further by making more use of std::atomic and by splitting
buf_pool_t::mutex further (MDEV-15053).

We will deprecate and ignore the following parameters:

	innodb_buffer_pool_instances
	innodb_page_cleaners

There will be only one buffer pool and one page cleaner task.

In a number of INFORMATION_SCHEMA views, columns that indicated
the buffer pool instance will be removed:

	information_schema.innodb_buffer_page.pool_id
	information_schema.innodb_buffer_page_lru.pool_id
	information_schema.innodb_buffer_pool_stats.pool_id
	information_schema.innodb_cmpmem.buffer_pool_instance
	information_schema.innodb_cmpmem_reset.buffer_pool_instance
2020-02-12 14:45:21 +02:00
Marko Mäkelä
fc2f2fa853 MDEV-19747: Deprecate and ignore innodb_log_optimize_ddl
During native table rebuild or index creation, InnoDB used to skip
redo logging and write MLOG_INDEX_LOAD records to inform crash recovery
and Mariabackup of the gaps in redo log. This is fragile and prohibits
some optimizations, such as skipping the doublewrite buffer for
newly (re)initialized pages (MDEV-19738).

row_merge_write_redo(): Remove. We do not write MLOG_INDEX_LOAD
records any more. Instead, we write full redo log.

FlushObserver: Remove.

fseg_free_page_func(): Remove the parameter log. Redo logging
cannot be disabled.

fil_space_t::redo_skipped_count: Remove.

We cannot remove buf_block_t::skip_flush_check, because PageBulk
will temporarily generate invalid B-tree pages in the buffer pool.
2020-02-11 18:44:26 +02:00
Marko Mäkelä
a983b24407 Merge 10.4 into 10.5 2020-01-28 14:17:09 +02:00
Oleksandr Byelkin
bfc24bb2ec Merge branch '10.3' into 10.4 2020-01-24 14:50:23 +01:00
Oleksandr Byelkin
ceda5f724f Merge branch '10.2' into 10.3 2020-01-24 14:16:20 +01:00
Marko Mäkelä
1d12bff42c MDEV-20775: page_zip_validate() failure due to AUTO_INCREMENT
cmake -DWITH_INNODB_EXTRA_DEBUG:BOOL=ON
was broken ever since commit 8777458a6e
(MDEV-6076 Persistent AUTO_INCREMENT for InnoDB).

There is a race condition between page reads that call
page_zip_validate() (while holding clustered index root page S-latch)
and writes that update PAGE_ROOT_AUTO_INC
(with buf_block_t::lock SX-latch, compatible with S-latch).

page_zip_validate_low(): Skip the PAGE_ROOT_AUTO_INC field on
clustered index root pages in order to avoid false positives.
2020-01-23 16:25:46 +02:00
Eugene Kosov
700e010309 fix aligned memcpy()-like functions usage
I found that memcpy_aligned was used incorrectly at redo log and decided to put
assertions in aligned functions. And found even more incorrect cases.

Given the amount discovered of bugs, I left assertions to prevent future bugs.

my_assume_aligned(): instead of MY_ASSUME_ALIGNED macro
2020-01-23 00:12:43 +08:00
Marko Mäkelä
c4de197a0a MDEV-21174: Optimize page_set_autoinc()
page_set_autoinc(): Check if any change would take place,
and omit the mtr_t::OPT parameter. This should avoid
unnecessarily redo log writes for ROW_FORMAT=COMPRESSED pages.
2020-01-08 14:08:00 +02:00
Eugene Kosov
496532b5c5 MDEV-20950: Fix 32-bit Windows build 2019-12-21 21:36:25 +02:00
Marko Mäkelä
ce70573f62 MDEV-21353 Assertion failures due to btr_pcur_restore_pos() misbehaving
In commit befde6e97e
a bogus condition was added, aiming to avoid debug assertion failures
during ALTER TABLE...IMPORT TABLESPACE.

page_cur_delete_rec(): Remove the bogus condition, and replace the
use of buf_block_modify_clock_inc() with a lower-level operation
that skips the debug checks that would fail during IMPORT.
2019-12-20 17:57:32 +02:00
Marko Mäkelä
28c89b7151 Merge 10.4 into 10.5 2019-12-16 07:47:17 +02:00
Marko Mäkelä
745fd4b39f MDEV-21174: Remove some mlog_write_initial_log_record_fast()
Pass buf_block_t* to more functions that write redo log.

page_zip_write_node_ptr(), page_zip_write_blob_ptr(),
page_zip_compress_write_log_no_data():
Take buf_block_t* as parameter, and do not tolerate mtr=NULL.

page_zip_compress(): Do not tolerate mtr=NULL.

page_zip_dir_insert(): Take page_cur_t* as parameter.

mlog_write_initial_log_record(): Remove. This function was unused.

RecIterator::remove(): Remove the redundant page_zip parameter.

PageConverter::m_page_zip_ptr: Remove.
2019-12-13 18:15:51 +02:00
Marko Mäkelä
2b5a269cb4 MDEV-21174: Clean up record insertion
page_cur_insert_rec_low(): Take page_cur_t* as a parameter,
and do not tolerate mtr=NULL.

page_cur_insert_rec_zip(): Do not tolerate mtr=NULL.
2019-12-13 18:15:51 +02:00
Marko Mäkelä
befde6e97e MDEV-12353 preparation: Clean up page_cur_delete_rec()
page_cur_delete_rec(): Do not tolerate mtr=NULL.

page_delete_rec(): Merge with the only caller, RecIterator::remove().

RecIterator::m_mtr: New data member: a dummy mini-transaction.
2019-12-13 18:15:35 +02:00
Marko Mäkelä
8fa759a576 Merge 10.3 into 10.4
We disable the MDEV-21189 test galera.galera_partition
because it times out.
2019-12-13 17:30:37 +02:00
Marko Mäkelä
3466b47b0d Merge 10.2 into 10.3 2019-12-13 10:08:57 +02:00
Eugene Kosov
f0aa073f2b MDEV-20950 Reduce size of record offsets
offset_t: this is a type which represents one record offset.
It's unsigned short int.

a lot of functions: replace ulint with offset_t

btr_pcur_restore_position_func(),
page_validate(),
row_ins_scan_sec_index_for_duplicate(),
row_upd_clust_rec_by_insert_inherit_func(),
row_vers_impl_x_locked_low(),
trx_undo_prev_version_build():
  allocate record offsets on the stack instead of waiting for rec_get_offsets()
  to allocate it from mem_heap_t. So, reducing  memory allocations.

RECORD_OFFSET, INDEX_OFFSET:
  now it's less convenient to store pointers in offset_t*
  array. One pointer occupies now several offset_t. And those constant are start
  indexes into array to places where to store pointer values

REC_OFFS_HEADER_SIZE: adjusted for the new reality

REC_OFFS_NORMAL_SIZE:
  increase size from 100 to 300 which means less heap allocations.
  And sizeof(offset_t[REC_OFFS_NORMAL_SIZE]) now is 600 bytes which
  is smaller than previous 800 bytes.

REC_OFFS_SEC_INDEX_SIZE: adjusted for the new reality

rem0rec.h, rem0rec.ic, rem0rec.cc:
  various arguments, return values and local variables types were changed to
  fix numerous integer conversions issues.

enum field_type_t:
  offset types concept was introduces which replaces old offset flags stuff.
  Like in earlier version, 2 upper bits are used to store offset type.
  And this enum represents those types.

REC_OFFS_SQL_NULL, REC_OFFS_MASK: removed

get_type(), set_type(), get_value(), combine():
  these are convenience functions to work with offsets and it's types

rec_offs_base()[0]:
  still uses an old scheme with flags REC_OFFS_COMPACT and REC_OFFS_EXTERNAL

rec_offs_base()[i]:
  these have type offset_t now. Two upper bits contains type.
2019-12-13 00:26:50 +07:00
Marko Mäkelä
0a20e5ab77 Merge 10.2 into 10.3 2019-12-12 14:41:51 +02:00
Marko Mäkelä
d146e3dcfe MDEV-21256: Simplify ut_rnd_interval()
ut_rnd_interval(): Remove the first parameter, which was mostly
passed as 0. Implement as a simple wrapper around ut_rnd_gen().
Trivially return 0 if the size of the interval is smaller than 2.

ut_rnd_ulint_counter, ut_rnd_gen_next_ulint(), ut_rnd_gen_ulint(): Remove.
2019-12-10 16:58:28 +02:00
Marko Mäkelä
51fc8ab73e MDEV-21256: Reduce the use of ut_rnd_gen_next_ulint()
ut_rnd_set_seed(): Unused function; remove.

ut_rnd_gen(): Renamed from page_cur_lcg_prng().

ut_rnd_current: The internal state of ut_rnd_gen().

page_cur_open_on_rnd_user_rec(): Replace linear search with
page_rec_get_nth().
2019-12-10 16:58:28 +02:00
Marko Mäkelä
42a4ae54c2 MDEV-21225 Remove ut_align() and use aligned_malloc()
Before commit 90c52e5291 introduced
aligned_malloc(), InnoDB always used a pattern of over-allocating
memory and invoking ut_align() to guarantee the desired alignment.

It is cleaner to invoke aligned_malloc() and aligned_free() directly.

ut_align(): Remove. In assertions, ut_align_down() can be used instead.
2019-12-05 06:42:31 +02:00
Marko Mäkelä
caea64df18 Cleanup: Remove some page_get_page_no() calls
Refer to buf_page_t::id instead of parsing the tablespace identifier
or page number from the buffer pool page.
2019-12-03 11:05:19 +02:00
Marko Mäkelä
56f6dab1d0 MDEV-21174: Replace mlog_write_ulint() with mtr_t::write()
mtr_t::write(): Replaces mlog_write_ulint(), mlog_write_ull().
Optimize away writes if the page contents does not change,
except when a dummy write has been explicitly requested.

Because the member function template takes a block descriptor as a
parameter, it is possible to introduce better consistency checks.
Due to this, the code for handling file-based lists, undo logs
and user transactions was refactored to pass around buf_block_t.
2019-12-03 11:05:18 +02:00
Marko Mäkelä
504823bcce MDEV-21174: Cleanup MLOG_PAGE_CREATE
page_create_write_log(), mlog_write_initial_log_record():
Merge to the only caller, and use
mlog_write_initial_log_record_low() for writing the log record.
2019-12-03 11:05:18 +02:00
Marko Mäkelä
bf2cc46798 MDEV-21133: Remove buf_frame_copy() 2019-12-03 11:05:18 +02:00
Marko Mäkelä
326515878d MDEV-21133: Alignment hints for ROW_FORMAT=COMPRESSED
page_zip_compress(), page_zip_decompress_low(), page_zip_reorganize():
Make use of memcpy_aligned() and memset_aligned(), so that some
operations can be translated more efficiently.

No difference was observed for AMD64 on GCC 9.2.1. But, such hints could
enable more efficient code on less common instruction set architectures,
such as 32-bit ARM, POWER or MIPS.
2019-11-29 14:48:47 +02:00
Marko Mäkelä
51a4260f00 MDEV-21133: Introduce memmove_aligned()
Both variants of the InnoDB page directory are aligned to the entry size
(16 bits). Inform the compiler about it.
2019-11-29 11:23:35 +02:00
Marko Mäkelä
29d67d051a Cleanup btr_page_get_prev(), btr_page_get_next()
Remove the redundant parameter mtr_t*.

Make use of page_has_prev(), page_has_next() whenever possible.
2019-11-11 13:36:21 +02:00
Marko Mäkelä
64a02e4fa2 MDEV-19586: Add const qualifiers
Except for fil_name_process(), which invokes os_normalize_path(),
the redo log record parser will not modify the redo log records.
Add const qualifiers accordingly.
2019-11-04 09:25:26 +02:00
Marko Mäkelä
0ccfdc8eff Remove InnoDB wrappers of <string.h> functions
ut_strcmp(), ut_strcpy(), ut_strlen(), ut_memcpy(), ut_memcmp(),
ut_memmove(): Remove. Invoke the standard library functions directly.
2019-10-30 07:31:39 +02:00
Marko Mäkelä
d04f2de80a Merge 10.4 into 10.5 2019-10-11 08:41:36 +03:00
Marko Mäkelä
09afd3da1a Merge 10.3 into 10.4 2019-10-10 21:30:40 +03:00
Marko Mäkelä
01f45becd1 MDEV-19783: Add more assertions
btr_page_get_split_rec_to_left(): Assert that in the leftmost leaf page,
the metadata record exists if and only if index->is_instant().

page_validate(): Correct the wording of a message.

rec_init_offsets(): Assert that whenever a record is in "instant ALTER"
format, index->is_instant() must hold.
2019-10-10 20:40:26 +03:00
Marko Mäkelä
7f84e3ad75 Merge 10.2 into 10.3 2019-10-10 20:38:44 +03:00
Marko Mäkelä
6d7a826953 MDEV-20788: Bogus assertion failure for PAGE_FREE list
In MDEV-11369 (instant ADD COLUMN) in MariaDB Server 10.3,
we introduced the hidden metadata record that must be the
first record in the clustered index if and only if
index->is_instant() holds.

To catch MDEV-19783, in
commit ed0793e096 and
commit 99dc40d6ac
we added some assertions to find cases where
the metadata record is missing while it should not be, or a
record exists when it should not. Those assertions were invalid
when traversing the PAGE_FREE list. That list can contain anything;
we must only be able to determine the successor and the size of
each garbage record in it.

page_validate(), page_simple_validate_old(), page_simple_validate_new():
Do not invoke page_rec_get_next_const() for traversing the PAGE_FREE
list, but instead use a lower-level accessor that does not attempt to
validate the REC_INFO_MIN_REC_FLAG.

page_copy_rec_list_end_no_locks(),
page_copy_rec_list_start(), page_delete_rec_list_start():
Add assertions.

btr_page_get_split_rec_to_left(): Remove a redundant return value,
and make the output parameter the return value.

btr_page_get_split_rec_to_right(), btr_page_split_and_insert(): Clean up.
2019-10-10 20:29:30 +03:00
Marko Mäkelä
c11e5cdd12 Merge 10.3 into 10.4 2019-10-10 11:19:25 +03:00
Marko Mäkelä
892378fb9d Merge 10.2 into 10.3 2019-10-09 13:25:11 +03:00
Eugene Kosov
ed0793e096 MDEV-19783: Add more REC_INFO_MIN_REC_FLAG checks
btr_cur_pessimistic_delete(): code changed in a way that allows
to put more REC_INFO_MIN_REC_FLAG assertions inside btr_set_min_rec_mark().
Without that change tests innodb.innodb-table-online,
innodb.temp_table_savepoint and innodb_zip.prefix_index_liftedlimit fail.

Removed basically duplicated page_zip_validate() calls
which fails because of temporary(!) invariant violation.
That fixed innodb_zip.wl5522_debug_zip and
innodb_zip.prefix_index_liftedlimit
2019-10-09 08:29:26 +03:00
Eugene Kosov
99dc40d6ac MDEV-19783 Random crashes and corrupt data in INSTANT-added columns
The bug affects MariaDB Server 10.3 or later, but it makes sense
to improve CHECK TABLE in earlier versions already.

page_validate(): Check REC_INFO_MIN_REC_FLAG in the records.
This allows CHECK TABLE to catch more bugs.
2019-10-09 08:29:26 +03:00
Marko Mäkelä
d480d28f4f Add page_has_prev(), page_has_next(), page_has_siblings()
Until now, InnoDB inefficiently compared the aligned fields
FIL_PAGE_PREV, FIL_PAGE_NEXT to the byte-order-agnostic value FIL_NULL.

This is a backport of 32170f8c6d
from MariaDB Server 10.3.
2019-10-09 08:29:26 +03:00
Marko Mäkelä
72f671ab7b Merge 10.4 into 10.5 2019-09-27 07:15:07 +03:00
Marko Mäkelä
1f4ee3fa5a MDEV-20117 Assertion 0 failed in row_sel_get_clust_rec_for_mysql
The crash scenario is as follows:

(1) A non-empty table exists.
(2) MDEV-15562 instant ADD/DROP/reorder has been invoked.
(3) Some purgeable undo log exists for the table.
(4) The table becomes empty, containing not even any delete-marked records,
only containing the hidden metadata record that was added in (2).
(5) An instant ADD/DROP/reorder column is executed, and the table
is emptied and the (2) metadata removed.
(6) Purge processes an undo log record from (3), which will refer to
a non-existent clustered index field, because the metadata that
was created in (2) was remoeved in (5).

We fix this by adjusting step (5) so that we will never remove the
MDEV-15562-style metadata record. Removing the MDEV-11369 metadata
record (instant ADD COLUMN to the end of the table) is completely
fine at any time when the table becomes empty, because
dict_index_t::n_fields will remain unchanged.

innobase_instant_try(): Never remove the MDEV-15562 metadata record.

page_cur_delete_rec(): Do not reset FIL_PAGE_TYPE when the
MDEV-15562 metadata record is being removed as part of
btr_cur_pessimistic_update() invoked by innobase_instant_try().
2019-09-26 19:45:10 +03:00
Marko Mäkelä
e3c39c0be8 MDEV-13564 follow-up: Remove dead code
In MariaDB 10.4.0, commit 09af00cbde
removed the crash-upgrade logic for the MariaDB 10.2
innodb_safe_truncate=OFF TRUNCATE TABLE (which was the only option
between MariaDB 10.2.2 and 10.2.18), but failed to adjust some
comments and code.

buf_page_io_complete(): Remove a bogus comment about TRUNCATE.

dict_recreate_index_tree(): Unused function; remove.

fil_space_t::stop_new_ops: Clarify the comment.

fil_space_acquire_low(): Remove a bogus comment about TRUNCATE.

fil_check_pending_ops(), fil_check_pending_io(): Adjust a warning message.
This code is only invoked as part of DISCARD TABLESPACE or DROP TABLE.
DROP TABLE is internally used as part of ALTER TABLE, OPTIMIZE TABLE,
or TRUNCATE TABLE.

RemoteDatafile::create_link_file(): Clarify a comment.

ibuf_delete_for_discarded_space(): Clarify the function comment.

dict_table_x_lock_indexes(), dict_table_x_unlock_indexes():
Merge with the only remaining caller, row_quiesce_set_state().

page_create_zip(): Remove a bogus comment about TRUNCATE.
2019-09-26 10:25:34 +03:00
Marko Mäkelä
8887effe13 MDEV-12353 preparation: page_mem_alloc_heap()
Define the function in the same compilation unit as its only callers
page_cur_insert_rec_low() and page_cur_insert_rec_zip().
2019-09-19 08:35:21 +03:00
Marko Mäkelä
71e856e152 MDEV-12353 preparation: Clean up page directory operations
page_mem_free(): Define in the same file with the only caller
page_cur_delete_rec().

page_dir_slot_set_rec(): Add const qualifier to a parameter.

page_dir_delete_slot(): Merge with the only caller page_dir_balance_slot().

page_dir_add_slot(): Merge with the only caller page_dir_split_slot().

page_dir_split_slot(), page_dir_balance_slot(): Define in the
same compilation unit with the callers, and simplify the code.
2019-09-18 10:53:31 +03:00
Marko Mäkelä
624dd71b94 Merge 10.4 into 10.5 2019-08-13 18:57:00 +03:00
Marko Mäkelä
e9c1701e11 Merge 10.3 into 10.4 2019-07-25 18:42:06 +03:00