Commit graph

2,888 commits

Author SHA1 Message Date
Eugene Kosov
e9de6386ad MDEV-18115 remove now unneeded constraint
log_group_max_size: no longer needed because the redo log does not use fil_io() now
2020-01-18 23:42:55 +08:00
Sergei Golubchik
e7558d4760 fix compilation w/o perfschema
followup for 3a3605f4b1
2020-01-16 18:13:55 +01:00
Marko Mäkelä
cc3135cf83 Fix a typo in a comment 2020-01-14 12:31:17 +02:00
Marko Mäkelä
90002480e9 MDEV-18115: Remove OS_AIO_LOG and IORequest::LOG 2020-01-10 19:08:46 +02:00
Eugene Kosov
3a3605f4b1 MDEV-21382 fix compilation without perfschema plugin 2020-01-09 16:57:15 +07:00
Marko Mäkelä
c62efb083c MDEV-21174: Remove a bogus comment 2020-01-08 18:23:55 +02:00
Marko Mäkelä
68fe5f534c Merge 10.4 into 10.5 2020-01-07 14:10:15 +02:00
Marko Mäkelä
d60dcabd0f Merge 10.3 into 10.4 2020-01-07 13:23:41 +02:00
Marko Mäkelä
eda719793a Merge 10.2 into 10.3 2020-01-07 12:14:35 +02:00
Marko Mäkelä
82187a1221 MDEV-21429 TRUNCATE and OPTIMIZE are being refused due to "row size too large"
By default (innodb_strict_mode=ON), InnoDB attempts to guarantee
at DDL time that any INSERT to the table can succeed.
MDEV-19292 recently revised the "row size too large" check in InnoDB.
The check still is somewhat inaccurate;
that should be addressed in MDEV-20194.

Note: If a table contains multiple long string columns so that each column
is part of a column prefix index, then an UPDATE that attempts to modify
all those columns at once may fail, because the undo log record might
not fit in a single undo log page (of innodb_page_size). In the worst case,
the undo log record would grow by about 3KiB for each updated column.

The DDL-time check (since the InnoDB Plugin for MySQL 5.1) is optional
in the sense that when the maximum B-tree record size or undo log
record size would be exceeded, the DML operation will fail and the
transaction will be properly rolled back.

create_table_info_t::row_size_is_acceptable(): Add the parameter
'bool strict' so that innodb_strict_mode=ON can be overridden during
TRUNCATE, OPTIMIZE and ALTER TABLE...FORCE (when the storage format
is not changing).

create_table_info_t::create_table(): Perform a sloppy check for
TRUNCATE TABLE (create_fk=false).

prepare_inplace_alter_table_dict(): Perform a sloppy check for
simple operations.

trx_is_strict(): Remove. The function became unused in
commit 98694ab0cb (MDEV-20949).
2020-01-07 11:02:12 +02:00
Eugene Kosov
6f2e228529 MDEV-21382 use fdatasync() for redo log where appropriate
log_t::files::fdatasync(): syncs only data for every log file

os_file_flush_data()
pfs_os_file_flush_data_func(): syncs only data for a given file
2020-01-05 16:08:44 +07:00
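The idea behind commit 6f2e228529 can be sketched as follows. This is a minimal illustration, assuming a POSIX system that provides fdatasync(); the helper name write_and_sync_data() is hypothetical and is not taken from the commit. fdatasync() flushes the file data, and only as much metadata as is needed to read that data back, which can be cheaper than a full fsync().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <unistd.h>

// Hypothetical helper: write a buffer and make the file data durable.
// Unlike fsync(), fdatasync() may skip metadata such as timestamps.
inline bool write_and_sync_data(int fd, const void *buf, std::size_t len) {
  assert(fd >= 0);
  const ssize_t written = write(fd, buf, len);
  if (written < 0 || static_cast<std::size_t>(written) != len)
    return false;  // a short write is treated as failure in this sketch
  return fdatasync(fd) == 0;
}
```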
Eugene Kosov
fd899b3bbd Let's add another intrusive doubly linked list!
Features:
* STL-like interface
* Fast modification: no branches on insertion or deletion
* Fast iteration: one pointer dereference and one pointer comparison
* Your class can be a part of several lists

Modeled after std::list<T> but currently has fewer methods (not complete yet)

For even more performance it's possible to customize the list with templates so
that it won't have a size counter variable or won't NULLify an unlinked node.

How do existing lists differ?

None of the existing lists supports an STL-like interface.

I_List:
* slower iteration (one more branch on iteration)
* element can't be a part of two lists simultaneously

I_P_List:
* slower modification (branches, except for the fastest push_back() case)
* slower iteration (one more branch on iteration)

UT_LIST_BASE_NODE_T:
* slower modification (branches)

Three UT_LISTs were replaced: two in fil_system_t and one in dyn_buf_t.
2020-01-04 13:39:14 +07:00
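The features listed in commit fd899b3bbd can be illustrated with a minimal sketch. It is not the code from the commit: all names (list_node, intrusive_list, tag_a, tag_b, item) are hypothetical, and the Tag parameter is one assumed way of letting a class belong to several lists. A circular sentinel node is what removes the branches from insertion and deletion, and it reduces iteration to one pointer dereference plus one pointer comparison.

```cpp
#include <cassert>
#include <cstddef>

// One pair of links per list membership; Tag distinguishes memberships.
template <class Tag = void> struct list_node {
  list_node *next = nullptr;
  list_node *prev = nullptr;
};

template <class T, class Tag = void> class intrusive_list {
  using node = list_node<Tag>;
  node sentinel_;         // circular: an empty list points to itself
  std::size_t size_ = 0;  // the commit mentions this could be optional

public:
  class iterator {
    node *n_;
  public:
    explicit iterator(node *n) : n_(n) {}
    iterator &operator++() { n_ = n_->next; return *this; }
    T &operator*() const { return *static_cast<T *>(n_); }
    bool operator==(const iterator &o) const { return n_ == o.n_; }
    bool operator!=(const iterator &o) const { return n_ != o.n_; }
    node *raw() const { return n_; }
  };

  intrusive_list() { sentinel_.next = sentinel_.prev = &sentinel_; }
  intrusive_list(const intrusive_list &) = delete;
  intrusive_list &operator=(const intrusive_list &) = delete;

  iterator begin() { return iterator(sentinel_.next); }
  iterator end() { return iterator(&sentinel_); }
  std::size_t size() const { return size_; }

  // Insert before pos. The sentinel guarantees that both neighbours
  // exist, so there is no NULL check and no branch.
  iterator insert(iterator pos, T &value) {
    node *cur = pos.raw();
    node *n = static_cast<node *>(&value);
    n->next = cur;
    n->prev = cur->prev;
    cur->prev->next = n;
    cur->prev = n;
    ++size_;
    return iterator(n);
  }
  void push_back(T &value) { insert(end(), value); }
  void push_front(T &value) { insert(begin(), value); }

  // Unlink an element that is currently in this list.
  void remove(T &value) {
    assert(size_ > 0);
    node *n = static_cast<node *>(&value);
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->next = n->prev = nullptr;
    --size_;
  }
};

// An element that can be in two lists at once, one per tag.
struct tag_a;
struct tag_b;
struct item : list_node<tag_a>, list_node<tag_b> {
  int id;
  explicit item(int i) : id(i) {}
};
```

Because the list does not own its elements, the caller must keep each item alive for as long as it is linked.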
Marko Mäkelä
9949ab9393 MDEV-12353 preparation: Cleanup MLOG_FILE_NAME logging
mtr_t::commit_files(): Renamed from mtr_t::commit_checkpoint().
Remove the redundant bool parameter, and instead use checkpoint_lsn=0
to indicate that no checkpoint marker should be written.
2020-01-02 11:15:04 +02:00
Eugene Kosov
562c037b48 MDEV-18115 Remove dummy tablespace for the redo log
Redo log subsystem was decoupled from tablespace subsystem. It now manages file
descriptors for redo log files by itself.

FIL_TYPE_LOG: removed, code in various places was simplified

SRV_LOG_SPACE_FIRST_ID: renamed to SRV_SPACE_ID_UPPER_BOUND
  to better match its purpose. Code in various places was simplified

fil_n_log_flushes: replaced with log_sys::flushes
fil_n_pending_log_flushes: replaced with log_sys::pending_flushes

log_t::files::files: redo log file descriptors
log_t::files::file_names: redo log file names

log_t::files::set_file_names(): set file names without opening them
log_t::files::open_files(): opens redo log files
log_t::files::read(): treats several files as one big file
log_t::files::write(): treats several files as one big file
log_t::files::fsync(): flushes page cache to disk
log_t::files::close_files(): closes redo log files

fil_open_log_and_system_tablespace_files(): renamed to
  fil_open_system_tablespace_files()
  and obviously it now doesn't open redo log files

global files[1000]: removed. Why was it needed at all?
2020-01-01 22:09:51 +08:00
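Treating several files as one big file, as commit 562c037b48 describes for log_t::files::read() and write(), comes down to offset arithmetic. The sketch below assumes that all files have the same size; the names chunk and split_io() are hypothetical and do not come from the commit. It splits one logical I/O request into per-file pieces.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One piece of a logical request that falls entirely within a single file.
struct chunk {
  std::size_t file;      // index of the file
  std::uint64_t offset;  // offset within that file
  std::uint64_t len;     // number of bytes in this piece
};

// Split the logical range [offset, offset + len) across equally sized files.
// The caller is assumed to ensure that the range fits in the file set.
inline std::vector<chunk> split_io(std::uint64_t offset, std::uint64_t len,
                                   std::uint64_t file_size) {
  assert(file_size > 0);
  std::vector<chunk> out;
  while (len > 0) {
    const std::size_t file = static_cast<std::size_t>(offset / file_size);
    const std::uint64_t local = offset % file_size;
    const std::uint64_t n = std::min(len, file_size - local);
    out.push_back({file, local, n});
    offset += n;
    len -= n;
  }
  return out;
}
```

For example, with 100-byte files a 20-byte request at logical offset 90 becomes 10 bytes at the end of file 0 followed by 10 bytes at the start of file 1.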
Marko Mäkelä
3fa4a9e6be Merge 10.4 into 10.5 2019-12-30 10:29:43 +02:00
Marko Mäkelä
ffc0a08d05 Merge 10.3 into 10.4 2019-12-30 10:27:59 +02:00
Nikita Malyavin
4923604ee2 MDEV-18865 Assertion `t->first->versioned_by_id()' failed in innodb_prepare_commit_versioned
Cause:
* row_start != 0 was treated as if it exists. Possible row permutations had probably not been taken into account.

Solution:
* Checking both row_start and row_end is correct, so the versioned() function is used
2019-12-29 12:16:04 +02:00
Marko Mäkelä
b36154a109 Cleanup log_rec_t 2019-12-27 21:20:03 +02:00
Marko Mäkelä
8cc15c036d Merge 10.4 into 10.5 2019-12-27 21:17:16 +02:00
Marko Mäkelä
4c25e75ce7 Merge 10.3 into 10.4 2019-12-27 18:20:28 +02:00
Marko Mäkelä
5ab70e7f68 Merge 10.2 into 10.3 2019-12-27 15:14:48 +02:00
Marko Mäkelä
16bce0f6fe Cleanup: Remove dict_delete_tablespace_and_datafiles()
The function was only called by innobase_drop_tablespace(),
which was removed in commit 494e4b99a4
and added in commit 2e814d4702.
2019-12-27 11:23:28 +02:00
Alexander Barkov
4c57ab34d4 Merge remote-tracking branch 'origin/10.3' into 10.4 2019-12-25 13:33:28 +04:00
Thirunarayanan Balathandayuthapani
90ba87cb9e MDEV-19176 Reduce the memory usage during recovery
- post-push to fix the compilation issue
2019-12-23 15:56:57 +05:30
Thirunarayanan Balathandayuthapani
bba59abb03 MDEV-19176 Reduce the memory usage during recovery
- Moved the recv_sys->heap memory condition inside recv_parse_log_recs(),
so that InnoDB can mark the status as STORE_NO earlier.

- InnoDB uses one third of the buffer pool chunk size for reading the redo
log records. This avoids the scenario where the buffer pool runs
out of memory during recovery.
2019-12-23 15:51:02 +05:30
Marko Mäkelä
73985d8301 Merge 10.1 into 10.2 2019-12-23 07:14:51 +02:00
Eugene Kosov
496532b5c5 MDEV-20950: Fix 32-bit Windows build 2019-12-21 21:36:25 +02:00
Marko Mäkelä
8174e68895 MDEV-21371 Assertion failure in page_rec_get_next_low() during innodb_gis.rtree_compress
A debug assertion that was added in
commit ed0793e096
turns out to be too strict. In the test innodb_gis.rtree_compress,4k
the function is sometimes being invoked by purge for a
spatial index root page that is not a leaf page (PAGE_LEVEL is 1).
2019-12-20 19:21:37 +02:00
Marko Mäkelä
44be8652c5 Cleanup: Remove fil_space_get_flags()
Replace fil_space_get_flags(space) == ULINT_UNDEFINED
with the functionally equivalent fil_space_get_size(space) == 0.
2019-12-18 16:27:26 +02:00
Marko Mäkelä
fb4a897fd9 MDEV-12353 preparation: Remove UNIV_LOG_LSN_DEBUG
The debug instrumentation with the MLOG_LSN pseudo-record has not been
used for debugging for years. Let us remove this code now.
It would have to be removed as part of MDEV-12353 or MDEV-14425 anyway,
when implementing a new redo log file format.
2019-12-17 15:39:21 +02:00
Marko Mäkelä
59a088744d Remove unused mlog_catenate_ulint_compressed()
The function was only used by trx_undo_page_init_log()
(writing the MLOG_UNDO_INIT record), which was removed in
commit ccb3550221.
2019-12-16 13:40:00 +02:00
Marko Mäkelä
28c89b7151 Merge 10.4 into 10.5 2019-12-16 07:47:17 +02:00
Marko Mäkelä
745fd4b39f MDEV-21174: Remove some mlog_write_initial_log_record_fast()
Pass buf_block_t* to more functions that write redo log.

page_zip_write_node_ptr(), page_zip_write_blob_ptr(),
page_zip_compress_write_log_no_data():
Take buf_block_t* as parameter, and do not tolerate mtr=NULL.

page_zip_compress(): Do not tolerate mtr=NULL.

page_zip_dir_insert(): Take page_cur_t* as parameter.

mlog_write_initial_log_record(): Remove. This function was unused.

RecIterator::remove(): Remove the redundant page_zip parameter.

PageConverter::m_page_zip_ptr: Remove.
2019-12-13 18:15:51 +02:00
Marko Mäkelä
2b5a269cb4 MDEV-21174: Clean up record insertion
page_cur_insert_rec_low(): Take page_cur_t* as a parameter,
and do not tolerate mtr=NULL.

page_cur_insert_rec_zip(): Do not tolerate mtr=NULL.
2019-12-13 18:15:51 +02:00
Marko Mäkelä
befde6e97e MDEV-12353 preparation: Clean up page_cur_delete_rec()
page_cur_delete_rec(): Do not tolerate mtr=NULL.

page_delete_rec(): Merge with the only caller, RecIterator::remove().

RecIterator::m_mtr: New data member: a dummy mini-transaction.
2019-12-13 18:15:35 +02:00
Marko Mäkelä
8fa759a576 Merge 10.3 into 10.4
We disable the MDEV-21189 test galera.galera_partition
because it times out.
2019-12-13 17:30:37 +02:00
Eugene Kosov
a9ea0056c7 MDEV-21133: use aligned memcpy in redo log and buffer pool 2019-12-13 21:03:50 +07:00
Marko Mäkelä
3466b47b0d Merge 10.2 into 10.3 2019-12-13 10:08:57 +02:00
Eugene Kosov
f0aa073f2b MDEV-20950 Reduce size of record offsets
offset_t: this is a type which represents one record offset.
It's unsigned short int.

a lot of functions: replace ulint with offset_t

btr_pcur_restore_position_func(),
page_validate(),
row_ins_scan_sec_index_for_duplicate(),
row_upd_clust_rec_by_insert_inherit_func(),
row_vers_impl_x_locked_low(),
trx_undo_prev_version_build():
  allocate record offsets on the stack instead of waiting for rec_get_offsets()
  to allocate them from mem_heap_t. This reduces memory allocations.

RECORD_OFFSET, INDEX_OFFSET:
  it is now less convenient to store pointers in an offset_t*
  array. One pointer now occupies several offset_t elements. These constants are
  the start indexes into the array where the pointer values are stored

REC_OFFS_HEADER_SIZE: adjusted for the new reality

REC_OFFS_NORMAL_SIZE:
  increase the size from 100 to 300, which means fewer heap allocations.
  And sizeof(offset_t[REC_OFFS_NORMAL_SIZE]) is now 600 bytes, which
  is smaller than the previous 800 bytes.

REC_OFFS_SEC_INDEX_SIZE: adjusted for the new reality

rem0rec.h, rem0rec.ic, rem0rec.cc:
  various arguments, return values and local variables types were changed to
  fix numerous integer conversions issues.

enum field_type_t:
  the concept of offset types was introduced, replacing the old offset flags.
  As in the earlier version, the 2 upper bits are used to store the offset type.
  This enum represents those types.

REC_OFFS_SQL_NULL, REC_OFFS_MASK: removed

get_type(), set_type(), get_value(), combine():
  these are convenience functions for working with offsets and their types

rec_offs_base()[0]:
  still uses an old scheme with flags REC_OFFS_COMPACT and REC_OFFS_EXTERNAL

rec_offs_base()[i]:
  these have type offset_t now. The two upper bits contain the type.
2019-12-13 00:26:50 +07:00
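The offset encoding described in commit f0aa073f2b can be sketched as follows. The function names get_type(), set_type(), get_value() and combine() and the type names offset_t and field_type_t come from the commit message, but the signatures, the enumerator names and their values are assumptions made for illustration, as is the 8-byte width of the old ulint element.

```cpp
#include <cassert>
#include <cstdint>

typedef unsigned short offset_t;  // one record offset, per the commit message
static_assert(sizeof(offset_t) == 2, "this sketch assumes a 16-bit short");

// Assumed enumerators: the type lives in the two upper bits.
enum field_type_t {
  STORED_IN_RECORD = 0 << 14,
  STORED_OFFPAGE = 1 << 14,
  SQL_NULL = 2 << 14,
  DEFAULT = 3 << 14
};

constexpr offset_t TYPE_MASK = 3u << 14;   // two upper bits
constexpr offset_t VALUE_MASK = 0x3FFF;    // lower 14 bits

inline field_type_t get_type(offset_t n) {
  return static_cast<field_type_t>(n & TYPE_MASK);
}

inline offset_t get_value(offset_t n) {
  return static_cast<offset_t>(n & VALUE_MASK);
}

inline offset_t combine(offset_t value, field_type_t type) {
  assert(value <= VALUE_MASK);  // the value must fit in 14 bits
  return static_cast<offset_t>(value | type);
}

inline void set_type(offset_t &n, field_type_t type) {
  n = combine(get_value(n), type);
}

// The size arithmetic from the commit message, assuming the old
// element type was 8 bytes wide: 300 * 2 = 600 versus 100 * 8 = 800.
static_assert(sizeof(offset_t[300]) == 600, "new array size");
static_assert(sizeof(std::uint64_t[100]) == 800, "old array size");
```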
Eugene Kosov
014e125830 optimize crash recovery
recv_dblwr_t::list is used for appending to the beginning and iterating
through its elements. std::deque fits better for that purpose because
it performs fewer allocations than std::forward_list and provides better memory
locality.
2019-12-12 22:19:41 +07:00
Marko Mäkelä
0a20e5ab77 Merge 10.2 into 10.3 2019-12-12 14:41:51 +02:00
Eugene Kosov
f4b4284650 MDEV-16678 Prefer MDL to dict_sys.latch for innodb background tasks
Use std::queue backed by std::deque instead of list because it performs fewer
allocations while still having O(1) push and pop operations.

Also store trx_purge_rec_t directly, because it's only 16 bytes and allocating
it is too wasteful. This should be faster.

purge_node_t::purge_node_t: container is already empty after creation

purge_node_t::end(): replace clearing container with assertion that it's clear
2019-12-11 23:32:50 +07:00
Marko Mäkelä
41e6a154ec MDEV-14482 - Cache line contention on ut_rnd_interval()
InnoDB RNG maintains global state, causing otherwise unnecessary bus
traffic. Even worse, this is cross-mutex traffic. That is, different
mutexes suffer from contention.

Fixed delay of 4 was verified to give best throughput by OLTP update
index and read-write benchmarks on Intel Broadwell (2/20/40) and
ARM (1/46/46).

This is a backport of ce04790065 from
MariaDB Server 10.3.
2019-12-10 17:01:36 +02:00
Marko Mäkelä
b1f2d3a8c8 MDEV-21256: Replace the 64-bit LCG with a 32-bit Galois LFSR
We should not need anywhere near 32 bits of entropy, so we might
just limit ourselves to a 32-bit random number generator.

Also, it might be cheaper to use exclusive-or, bit shifting and
conditional jumps, instead of multiplication and addition.

We use relaxed atomic operations on the global random number generator
state in an attempt to silence any warnings about race conditions.
There is an obvious race condition between the load and store in
ut_rnd_gen(), but we do not think that it matters much that the
state of the random number generator could 'stutter'.

This change seems to make the 'uncompress_ops' nondeterministic
in innodb_zip.cmp_per_index after the restart. It looks like
there is an inherent race condition in the test, because the
table could be opened for InnoDB statistics recalculation
already before innodb_cmp_per_index_enabled was set. We might
end up having uncompress_ops anywhere between 0 and 9, or perhaps
even more. Let us remove that part of the test.
2019-12-10 16:59:34 +02:00
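A 32-bit Galois LFSR of the kind commit b1f2d3a8c8 describes can be sketched as below. The names lfsr_state and lfsr_next(), the initial state 1 and the tap mask 0x80200003 (a commonly cited maximal-length polynomial) are all assumptions; the commit message does not state which polynomial or seed InnoDB uses. As in the commit message, the load and the store are separate relaxed atomic operations, so two threads may occasionally obtain the same value.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Assumed global generator state; it must never be zero.
static std::atomic<std::uint32_t> lfsr_state{1};

// One step of a Galois LFSR: shift right, and if the bit shifted out
// was set, exclusive-or the tap mask into the state.
inline std::uint32_t lfsr_next() {
  std::uint32_t s = lfsr_state.load(std::memory_order_relaxed);
  const std::uint32_t lsb = s & 1u;
  s >>= 1;
  if (lsb)
    s ^= 0x80200003u;  // assumed taps: bits 32, 22, 2 and 1
  // Not an atomic read-modify-write: a concurrent caller may observe
  // the same old state, which only makes the sequence "stutter".
  lfsr_state.store(s, std::memory_order_relaxed);
  return s;
}
```

Only shifts, an exclusive-or and a conditional are needed per step, in contrast to the multiplication and addition of a linear congruential generator.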
Marko Mäkelä
d146e3dcfe MDEV-21256: Simplify ut_rnd_interval()
ut_rnd_interval(): Remove the first parameter, which was mostly
passed as 0. Implement as a simple wrapper around ut_rnd_gen().
Trivially return 0 if the size of the interval is smaller than 2.

ut_rnd_ulint_counter, ut_rnd_gen_next_ulint(), ut_rnd_gen_ulint(): Remove.
2019-12-10 16:58:28 +02:00
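The simplified interval function from commit d146e3dcfe can be sketched as a thin wrapper around a generator. The name rnd_interval() and the use of the modulo operator are assumptions; the commit message only says that the function wraps ut_rnd_gen() and returns 0 for intervals smaller than 2. The generator is passed in as a callable so that the sketch stays self-contained.

```cpp
#include <cassert>
#include <cstdint>

// Return a value in [0, n), or 0 if the interval has fewer than 2 elements.
// Reducing with % is slightly biased unless n divides the generator's range.
template <class Gen>
std::uint32_t rnd_interval(Gen &&gen, std::uint32_t n) {
  return n < 2 ? 0 : static_cast<std::uint32_t>(gen() % n);
}
```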
Marko Mäkelä
51fc8ab73e MDEV-21256: Reduce the use of ut_rnd_gen_next_ulint()
ut_rnd_set_seed(): Unused function; remove.

ut_rnd_gen(): Renamed from page_cur_lcg_prng().

ut_rnd_current: The internal state of ut_rnd_gen().

page_cur_open_on_rnd_user_rec(): Replace linear search with
page_rec_get_nth().
2019-12-10 16:58:28 +02:00
Marko Mäkelä
ea37b14409 MDEV-16678 Prefer MDL to dict_sys.latch for innodb background tasks
This is joint work with Thirunarayanan Balathandayuthapani.
The MDL interface between InnoDB and the rest of the server
(in storage/innobase/dict/dict0dict.cc and in include/)
is my work, while most everything else is Thiru's.

The collection of InnoDB persistent statistics and the
defragmentation were not refactored to use MDL. They will
keep relying on lower-level interlocking with
fil_check_pending_operations().

The purge of transaction history and the background operations on
fulltext indexes will use MDL. We will revert
commit 2c4844c9e7
(MDEV-17813) because thanks to MDL, purge cannot conflict
with DDL operations anymore. For a similar reason, we will remove
the MDEV-16222 test case from gcol.innodb_virtual_debug_purge.

Purge is essentially replacing all use of the global dict_sys.latch
with MDL. Purge will skip the undo log records for tables whose names
start with #sql-ib or #sql2. Theoretically, such tables might
be renamed back to visible table names if TRUNCATE fails to
create a new table, or the final rename in ALTER TABLE...ALGORITHM=COPY
fails. In that case, purge could permanently leave some garbage
in the table. Such garbage will be tolerated; the table would not
be considered corrupted.

To avoid repeated MDL releases and acquisitions,
trx_purge_attach_undo_recs() will sort undo log records by table_id,
and purge_node_t will keep the MDL and table handle open for multiple
successive undo log records.

get_purge_table(): A new accessor, used during the purge of
history for indexed virtual columns. This interface should ideally
not exist at all.

thd_mdl_context(): Accessor of THD::mdl_context.
Wrapped in a new thd_mdl_service.

dict_get_db_name_len(): Define inline.

dict_acquire_mdl_shared(): Acquire explicit shared MDL on a table name
if needed.

dict_table_open_on_id(): Return MDL_ticket, if requested.

dict_table_close(): Release MDL ticket, if requested.

dict_fts_index_syncing(), dict_index_t::index_fts_syncing: Remove.
row_drop_table_for_mysql() no longer needs to check these, because
MDL guarantees that a fulltext index sync will not be in progress
while MDL_EXCLUSIVE is protecting a DDL operation.

dict_table_t::parse_name(): Parse the table name for acquiring MDL.

purge_node_t::undo_recs: Change the type to std::list<trx_purge_rec_t*>
(different container, and storing also roll_ptr).

purge_node_t: Add mdl_ticket, last_table_id, purge_thd, mdl_hold_recs
for acquiring MDL and for keeping the table open across multiple
undo log records.

purge_vcol_info_t, row_purge_store_vsec_cur(), row_purge_restore_vsec_cur():
Remove. We will acquire the MDL earlier.

purge_sys_t::heap: Added, for reading undo log records.

fts_sync_during_ddl(): Invoked during ALGORITHM=INPLACE operations
to ensure that fts_sync_table() will not conflict with MDL_EXCLUSIVE.
Uses fts_t::sync_message for bookkeeping.
2019-12-10 15:42:50 +02:00
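The batching idea in commit ea37b14409, sorting undo log records by table_id so that one lock acquisition can serve several successive records, can be sketched as follows. The names undo_rec and purge_batch() are hypothetical, and the lock itself is only simulated by a counter; this is not the logic of trx_purge_attach_undo_recs().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct undo_rec {
  std::uint64_t table_id;
  std::uint64_t roll_ptr;
};

// Sort the batch by table and count how many times a (simulated)
// per-table lock would have to be acquired while processing it.
inline std::size_t purge_batch(std::vector<undo_rec> &recs) {
  std::stable_sort(recs.begin(), recs.end(),
                   [](const undo_rec &a, const undo_rec &b) {
                     return a.table_id < b.table_id;
                   });
  std::size_t acquisitions = 0;
  std::uint64_t last_table_id = 0;
  bool holding = false;
  for (const undo_rec &r : recs) {
    if (!holding || r.table_id != last_table_id) {
      // Here the previous lock would be released and a new one acquired.
      ++acquisitions;
      last_table_id = r.table_id;
      holding = true;
    }
    // Here the record r would be purged under the held lock.
  }
  return acquisitions;
}
```

With records for tables 7, 3, 7, 3, 5 the unsorted order would need five acquisitions, while the sorted order needs three.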
Oleksandr Byelkin
a15234bf4b Merge branch '10.3' into 10.4 2019-12-09 15:09:41 +01:00
Marko Mäkelä
292015d486 MDEV-21254 Remove unused keywords from the InnoDB SQL parser
The InnoDB internal SQL parser, which is used for updating the InnoDB
data dictionary tables (to be removed in MDEV-11655), persistent
statistics (to be refactored in MDEV-15020) and fulltext indexes,
implements some unused keywords and built-in functions:

OUT BINARY BLOB INTEGER FLOAT SUM DISTINCT READ
COMPACT BLOCK_SIZE
TO_CHAR TO_NUMBER BINARY_TO_NUMBER REPLSTR SYSDATE PRINTF ASSERT
RND RND_STR ROW_PRINTF UNSIGNED

Also, procedures are never declared with parameters. Only one top-level
procedure is declared and invoked at a time, and parameters are being
passed via pars_info_t.
2019-12-09 12:32:04 +02:00
Marko Mäkelä
42a4ae54c2 MDEV-21225 Remove ut_align() and use aligned_malloc()
Before commit 90c52e5291 introduced
aligned_malloc(), InnoDB always used a pattern of over-allocating
memory and invoking ut_align() to guarantee the desired alignment.

It is cleaner to invoke aligned_malloc() and aligned_free() directly.

ut_align(): Remove. In assertions, ut_align_down() can be used instead.
2019-12-05 06:42:31 +02:00
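The two allocation patterns contrasted in commit 42a4ae54c2 can be sketched as follows. The names align_up(), alloc_over() and alloc_aligned() are hypothetical, and POSIX posix_memalign() is used here only as a stand-in, since the commit message does not show how aligned_malloc() is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <stdlib.h>

// Round a pointer up to a power-of-two alignment (the ut_align() idea).
inline void *align_up(void *p, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
  const std::uintptr_t u = reinterpret_cast<std::uintptr_t>(p);
  return reinterpret_cast<void *>((u + mask) & ~mask);
}

// Old pattern: over-allocate, align inside the block, and remember the
// raw pointer because that is what must be passed to free().
struct over_allocated {
  void *raw;
  void *aligned;
};

inline over_allocated alloc_over(std::size_t size, std::size_t alignment) {
  void *raw = std::malloc(size + alignment);
  return {raw, raw ? align_up(raw, alignment) : nullptr};
}

// Newer pattern: ask the allocator for aligned memory directly.
// posix_memalign() requires a power-of-two multiple of sizeof(void*).
inline void *alloc_aligned(std::size_t size, std::size_t alignment) {
  void *p = nullptr;
  return posix_memalign(&p, alignment, size) == 0 ? p : nullptr;
}
```

The direct form wastes no padding bytes and needs only one pointer per allocation, which is the cleanup the commit argues for.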