Before commit 90c52e5291 introduced
aligned_malloc(), InnoDB always used a pattern of over-allocating
memory and invoking ut_align() to guarantee the desired alignment.
It is cleaner to invoke aligned_malloc() and aligned_free() directly.
ut_align(): Remove. In assertions, ut_align_down() can be used instead.
The function fsp_header_get_space_id() returns ulint instead of
uint32_t, only to be able to complain that the two adjacent
tablespace ID fields in the page differ. Remove the function,
and merge the check to the callers.
Also, make some more use of aligned_malloc().
trx_rseg_write_wsrep_checkpoint(): Use mtr_t::OPT, because
much of the time, the redo log writes would be redundant.
This was broken in commit 56f6dab1d0.
InnoDB startup was discovering undo tablespaces in a dirty way.
It was reading a possibly stale copy of the TRX_SYS page before
processing any redo log records.
srv_start(): Do not call buf_pool_invalidate(). Invoke
trx_rseg_get_n_undo_tablespaces() after the recovery has been initiated.
recv_recovery_from_checkpoint_start(): Assert that the buffer pool is
empty. This used to be guaranteed by the buf_pool_invalidate() call.
trx_rseg_get_n_undo_tablespaces(): Move to the calling compilation unit,
and reimplement in a simpler way.
srv_undo_tablespace_create(): Remove the constant parameter
size=SRV_UNDO_TABLESPACE_SIZE_IN_PAGES.
srv_undo_tablespace_open(): Reimplement in a cleaner way, with
more robust error handling.
srv_all_undo_tablespaces_open(): Split from srv_undo_tablespaces_init().
srv_undo_tablespaces_init(): Read all "undo001","undo002" tablespace
files directly, without consulting the TRX_SYS page via calling
trx_rseg_get_n_undo_tablespaces().
This is joint work with Thirunarayanan Balathandayuthapani.
Mixing %type and %expect declarations:
- sql_mode=ORACLE declarations look like an empty C code section
inside sql_yacc.yy, consisting of an inactive #ifdef..#endif block.
- sql_mode=DEFAULT declarations look like an empty C code section
inside sql_yacc_ora.yy, consisting of an inactive #ifdef..#endif block.
Mixing rules:
- Adding a special rule _empty to the shared rule section.
- Changing all instances of /*Empty*/ in sql_mode=DEFAULT and sql_mode=ORACLE
specific sections to _empty.
- Changing the rest of C style comments /*xxx*/ in
sql_mode=DEFAULT and sql_mode=ORACLE specific blocks to C++ style: //xxx
- Mixing sql_yacc.yy and sql_yacc_ora.yy, so
sql_mode=ORACLE specific blocks sit in a comment inside sql_yacc.yy, and
sql_mode=DEFAULT specific blocks sit in a comment inside sql_yacc_ora.yy.
In commit af5947f433
the function btr_discard_page() is invoking btr_set_min_rec_mark()
with the wrong buf_block_t* object. node_ptr is on merge_block,
not block.
btr_discard_page(): Remove the variables merge_page, page, and
always refer to block->frame or merge_block->frame instead.
Also, limit the scope of node_ptr and avoid duplicated conditions.
btr_set_min_rec_mark(): Add a template parameter, so that the
caller can specify whether the page is supposed to have a left sibling.
Otherwise, the assertion (which was introduced in the same commit)
would fail in btr_discard_page().
mtr_t::memcpy(): Replaces mlog_write_string(), mlog_log_string().
The buf_block_t is passed a parameter, so that
mlog_write_initial_log_record_low() can be used instead of
mlog_write_initial_log_record_fast().
fil_space_crypt_t::write_page0(): Remove the fil_space_t* parameter.
Passing buf_block_t helps us avoid calling
mlog_write_initial_log_record_fast() and page_get_page_no(),
and allows us to implement more debug checks, such as
that on ROW_FORMAT=COMPRESSED index pages, only the page header
may be modified by MLOG_MEMSET records.
fseg_n_reserved_pages(): Add a buf_block_t parameter.
mtr_t::write(): Replaces mlog_write_ulint(), mlog_write_ull().
Optimize away writes if the page contents does not change,
except when a dummy write has been explicitly requested.
Because the member function template takes a block descriptor as a
parameter, it is possible to introduce better consistency checks.
Due to this, the code for handling file-based lists, undo logs
and user transactions was refactored to pass around buf_block_t.
page_create_write_log(), mlog_write_initial_log_record():
Merge to the only caller, and use
mlog_write_initial_log_record_low() for writing the log record.
aio_linux::m_max_io_count: Unused data member; remove.
aiocb::m_ret_len: Declare as the more compatible type size_t.
Unfortunately, ssize_t is not available on Microsoft Visual Studio.
btr_set_min_rec_mark(): Write MLOG_1BYTE instead of
MLOG_REC_MIN_MARK or MLOG_COMP_REC_MIN_MARK.
On ROW_FORMAT=COMPRESSED pages, the minimum record flag is not stored
at all. The flag is computed for the uncompressed page by
page_zip_decompress(). Hence, nothing needs to be logged for
ROW_FORMAT=COMPRESSED tables for this operation.
To facilitate crash-upgrade and hot backup from older versions,
we will retain the code to parse and apply the old log record types
MLOG_REC_MIN_MARK and MLOG_COMP_REC_MIN_MARK.
The MLOG_FILE_WRITE_CRYPT_DATA record was completely redundant.
It can be replaced with a single MLOG_WRITE_STRING record.
To facilitate upgrade from older versions, we will retain
fil_parse_write_crypt_data().
fil_crypt_parse(): Recover fil_space_crypt_t::write_page0().
fil_space_crypt_t::write_page0(): Write everything in a single
MLOG_WRITE_STRING for easy parsing.
fil_space_crypt_t::page0_offset: Remove.
Don't do skip_setup_conds() unless all errors are checked.
Fixes following errors:
ER_PERIOD_NOT_FOUND
ER_VERS_QUERY_IN_PARTITION
ER_VERS_ENGINE_UNSUPPORTED
ER_VERS_NOT_VERSIONED
Don't do skip_setup_conds() unless all errors are checked.
Fixes following errors:
ER_PERIOD_NOT_FOUND
ER_VERS_QUERY_IN_PARTITION
ER_VERS_ENGINE_UNSUPPORTED
ER_VERS_NOT_VERSIONED
LIMIT history partitions cannot be checked by existing algorithm of
check_misplaced_rows() because working history partition is
incremented each time another one is filled. The existing algorithm
gets record and tries to decide partition id for it by
get_partition_id(). For LIMIT history it will just get first
non-filled partition.
To fix such partitions it is required to do REBUILD instead of REPAIR.
When view is merged by DT_MERGE_FOR_INSERT it is then skipped from
processing and doesn't update WHERE clause with
vers_setup_conds(). Note that view itself cannot work in
vers_setup_conds() because it doesn't have row_start, row_end
fields. Thus it is required to descend down to material TABLE_LIST
through calls of mysql_derived_prepare() and run vers_setup_conds()
from there. Luckily, all views (views of views, views of views of
views, etc.) are linked in one list through next_global pointer, so we
can skip all views of views and get straight to non-view TABLE_LIST by
checking its merge_underlying_list property for zero value (it is
assigned by DT_MERGE_FOR_INSERT for merged derived tables).
We have to do that only for UPDATE and DELETE. Other DML commands
don't use WHERE clause.
MDEV-21146 Assertion `m_lock_type == 2' in handler::ha_drop_table upon LOAD DATA
LOAD DATA does not use WHERE and the above call of vers_setup_conds()
is not needed. unit->prepare() led to wrongly locked temporary table.
"write set" for replication finally got its correct place
(mark_columns_per_binlog_row_image()). When done generally in
mark_columns_needed_for_update() it affects optimization
algorithm. used_key_is_modified, query_plan.using_io_buffer are
wrongly set and that leads to wrong prepare_for_keyread() which limits
read_set.
Turn read cache off for update and multi-update for versioned
table. no_cache is reinited on each TABLE open because it is
applicable for specific algorithms.
As a side fix vers_insert_history_row() honors vers_write setting.
Aria with row_format=fixed uses IO_CACHE of type READ_CACHE for
sequential read in update loop. When history row is inserted inside
this loop the cache misses it and fails with error.
TODO:
Currently maria_extra() does not support SEQ_READ_APPEND. Probably it
might be possible to use this type of cache.
page_zip_compress(), page_zip_decompress_low(), page_zip_reorganize():
Make use of memcpy_aligned() and memset_aligned(), so that some
operations can be translated more efficiently.
No difference was observed for AMD64 on GCC 9.2.1. But, such hints could
enable more efficient code on less common instruction set architectures,
such as 32-bit ARM, POWER or MIPS.
Patch `36a2a185fe18` introduced `wsrep_server_incoming_address()` in `10.4`.
Since AIX `/usr/include/netinet/ip.h` header defines `ip_len` as `ip_ff.ip_flen`
and `size_t const ip_len` is preprocessed as `size_t const ip_ff.ip_vhltl.ip_x.ip_xlen`,
to prevent the define from overwriting code in MariaDB,
rename the variable name to `ip_len_mdb`.
This patch is done by Tony Reix <tony.reix@atos.net>.
This patch was submitted under MCA.
Closes#1307
Before commit c0f47a4a58 (MDEV-12026)
introduced the innodb_checksum_algorithm=full_crc32 format,
it was impossible to tell if InnoDB data files contained garbage in
the FIL_PAGE_TYPE header field (and possibly other fields).
This is because before commit 3926673ce7
in MySQL 5.1.48, InnoDB would write uninitialized data to some fields,
and because there was no way to tell with which InnoDB version a data
file was created.
If fil_space_t::full_crc32() holds, the data file cannot contain
uninitialized garbage or invalid FIL_PAGE_TYPE, and thus
fil_block_check_type() should not be invoked to correct anything.