Analysis:
=========
During alter table rebuild, InnoDB fails to apply concurrent insert log.
If an insert log record spans two blocks, the apply phase tries to
access the next block without fetching it.
Fix:
====
During virtual column parsing, check whether the record spans
blocks before accessing the virtual column information.
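The idea behind the fix can be illustrated with a minimal sketch. This is
not InnoDB code: the 2-byte length prefix and the names parse_record() and
apply_blocks() are hypothetical, chosen only to show a parser that refuses
to read past the end of the current block and waits for the next one.

```python
def parse_record(buf: bytes, pos: int):
    """Return (payload, next_pos), or None if the record is incomplete.

    Hypothetical record layout: 2-byte big-endian length, then the payload.
    """
    if pos + 2 > len(buf):
        return None  # even the length prefix continues in the next block
    length = int.from_bytes(buf[pos:pos + 2], "big")
    end = pos + 2 + length
    if end > len(buf):
        return None  # the payload continues in the next block
    return buf[pos + 2:end], end


def apply_blocks(blocks):
    """Apply all records, carrying an incomplete tail over to the next block."""
    applied = []
    pending = b""
    for block in blocks:
        buf = pending + block  # "fetch" the next block before parsing further
        pos = 0
        while True:
            parsed = parse_record(buf, pos)
            if parsed is None:
                break
            payload, pos = parsed
            applied.append(payload)
        pending = buf[pos:]
    return applied
```

In this sketch, a record that straddles a block boundary is never parsed
until its remaining bytes have been appended to the buffer.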
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
RB: 16243
Analysis:
========
During alter table rebuild, InnoDB fails to apply concurrent delete log.
Parsing and validation of the merge record happen while applying the
log operation on a table. Validation goes wrong for the virtual column:
it assumes that virtual column information cannot be at the end
of the merge record.
Fix:
====
Virtual column information can be at the end of the merge record.
row_log_table_delete() writes the virtual column information at the
end of the record.
Reviewed-by: Satya Bodapati<satya.bodapati@oracle.com>
RB: 16155
MariaDB 10.2 never contained the Oracle change
Bug#23481444 OPTIMISER CALL ROW_SEARCH_MVCC() AND READ THE
INDEX APPLIED BY UNCOMMITTED ROWS
because it was considered risky for a GA release and incomplete.
Remove the references that were added when merging MySQL 5.6.36
to MariaDB 10.0.31, 10.1.24, and 10.2.7.
Analysis:
========
(1) During TRUNCATE of a file_per_table tablespace, dict_operation_lock is
released before the dirty pages of the tablespace are evicted from the
buffer pool. After eviction, we try to re-acquire dict_operation_lock
(a higher-level latch) while still holding a lower-level latch
(index->lock). This causes a latch order violation.
(2) A deadlock is possible if the child table is being truncated and
holds the index lock while a cascading DML operation has acquired
dict_operation_lock and is waiting for the index lock.
Fix:
====
1) Release the index locks before releasing dict_operation_lock.
2) Ignore the cascading DML operation on the parent table, for the
cascading foreign key, if the child table is truncated or is
in the process of being truncated.
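The latch order violation in (1) can be reproduced with a small sketch.
The numeric levels and the LatchTracker class are hypothetical; only the
relative order (dict_operation_lock above index->lock) is taken from the
analysis above.

```python
# Hypothetical numeric levels; only their relative order matters here.
LATCH_LEVEL = {"index->lock": 1, "dict_operation_lock": 2}


class LatchOrderViolation(Exception):
    pass


class LatchTracker:
    """Rejects acquiring a higher-level latch while holding a lower-level one."""

    def __init__(self):
        self.held = []

    def acquire(self, name):
        level = LATCH_LEVEL[name]
        for other in self.held:
            if LATCH_LEVEL[other] < level:
                raise LatchOrderViolation(
                    f"{name} requested while holding {other}")
        self.held.append(name)

    def release(self, name):
        self.held.remove(name)


def truncate_before_fix(t):
    t.acquire("dict_operation_lock")
    t.acquire("index->lock")
    t.release("dict_operation_lock")  # released before evicting dirty pages
    t.acquire("dict_operation_lock")  # index->lock is still held: violation


def truncate_after_fix(t):
    t.acquire("dict_operation_lock")
    t.acquire("index->lock")
    t.release("index->lock")          # release the index lock first
    t.release("dict_operation_lock")
    t.acquire("dict_operation_lock")  # nothing lower-level is held any more
```

Only the second sequence passes the check, which mirrors fix 1).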
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
Reviewed-by: Kevin Lewis <kevin.lewis@oracle.com>
RB: 16122
Problem:
=======
Offsets allocate memory from row_heap even when traversing deleted
rows during table rebuild.
Solution:
=========
Empty the row_heap even for deleted records, so that the
offsets allocations do not accumulate with every row.
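The effect can be shown with a toy model. HeapSketch, OFFSETS_BYTES and
scan() are hypothetical stand-ins, not the InnoDB mem_heap API; the sketch
only assumes that offsets are allocated for every record and that emptying
the heap returns its usage to zero.

```python
OFFSETS_BYTES = 64  # hypothetical size of one offsets array


class HeapSketch:
    """Toy arena that only tracks current and peak usage."""

    def __init__(self):
        self.used = 0
        self.peak = 0

    def alloc(self, nbytes):
        self.used += nbytes
        self.peak = max(self.peak, self.used)

    def empty(self):
        self.used = 0


def scan(deleted_flags, empty_on_deleted):
    """Traverse records and return the peak heap usage in bytes."""
    heap = HeapSketch()
    for deleted in deleted_flags:
        heap.alloc(OFFSETS_BYTES)  # offsets are computed for every record
        if deleted:
            if empty_on_deleted:   # the fix: empty the heap here as well
                heap.empty()
            continue               # deleted records are skipped
        heap.empty()               # live record processed, heap emptied
    return heap.peak
```

With a long run of deleted records, the peak stays at one offsets array
when the heap is emptied, and grows with the number of records otherwise.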
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
RB: 15694
Cherry-pick the commit from MySQL 5.7.19, and adapt the test case:
commit 45c933ac19c73a3e9c756a87ee1ba18ba1ac564c
Author: Aakanksha Verma <aakanksha.verma@oracle.com>
Date: Tue Mar 21 10:31:43 2017 +0530
Bug #25189192 ERRORS WHEN RESTARTING MYSQL AFTER RENAME TABLE.
PROBLEM
While renaming a table, InnoDB doesn't update the InnoDB dictionary table
INNODB_SYS_DATAFILES in case the database changes as part of the
rename. Hence on a restart the server log shows an error that it
couldn't find the table with the old path, although the table has
actually been renamed. The errors vanish only if we update the system
tablespace.
FIX
Update the InnoDB dictionary table with the new path when not only
the table name changes but also the database holding the table.
Reviewed-by: Jimmy Yang<Jimmy.Yang@oracle.com>
RB: 15751
Revert the following change, because Memcached is not present
in MariaDB Server. We had better avoid adding dead code.
commit d9bc5e03d788b958ce8c76e157239953db60adb2
Author: Aakanksha Verma <aakanksha.verma@oracle.com>
Date: Thu May 18 14:31:01 2017 +0530
Bug #24605783 MYSQL GOT SIGNAL 6 ASSERTION FAILURE
row_log_table_apply_convert_mrec(), row_log_table_apply_insert_low(),
row_log_table_apply_insert(), row_log_table_apply_update(): Remove
the trx_id parameter. The clustered index may contain DB_TRX_ID=0,
but PAGE_MAX_TRX_ID=0 is invalid for secondary indexes. Use the
current transaction ID when inserting secondary index records.
row_log_table_apply_op(): Remove the trx_id_col parameter.
row_update_for_mysql(): Remove the wrapper function and
rename the function from row_update_for_mysql_using_upd_graph().
Remove the unused parameter mysql_rec.
Following merge from 5.6.36, this merge also rejects changes that
collided with the rejection of 6ca4f693c1ce472e2b1bf7392607c2d1124b4293.
We initially rejected 6ca4f693c1ce472e2b1bf7392607c2d1124b4293 because
it was introducing a new storage engine API method.
Let InnoDB purge reset DB_TRX_ID,DB_ROLL_PTR when the history is removed.
[TODO: It appears that the resetting is not taking place as often as
it could be. We should test that a simple INSERT should eventually
cause row_purge_reset_trx_id() to be invoked unless DROP TABLE is
invoked soon enough.]
The InnoDB clustered index record system columns DB_TRX_ID,DB_ROLL_PTR
are used by multi-versioning. After the history is no longer needed, these
columns can safely be reset to 0 and 1<<55 (to indicate a fresh insert).
When a reader sees 0 in the DB_TRX_ID column, it can instantly determine
that the record is present in the read view. There is no need to acquire
the transaction system mutex to check if the transaction exists, because
writes can never be conducted by a transaction whose ID is 0.
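A simplified sketch of the visibility check with this fast path follows.
ReadViewSketch and its fields are loosely modeled on a read view but are
hypothetical and omit details such as the creator transaction; the reset
values 0 and 1<<55 are the ones stated above.

```python
from dataclasses import dataclass

RESET_TRX_ID = 0         # DB_TRX_ID after the history has been purged
RESET_ROLL_PTR = 1 << 55  # DB_ROLL_PTR value indicating a fresh insert


@dataclass(frozen=True)
class ReadViewSketch:
    up_limit_id: int    # IDs below this were committed before the view
    low_limit_id: int   # IDs at or above this started after the view
    active_ids: frozenset = frozenset()  # IDs active when the view was created

    def changes_visible(self, trx_id: int) -> bool:
        if trx_id == RESET_TRX_ID:
            return True  # fast path: no transaction lookup is needed
        if trx_id < self.up_limit_id:
            return True
        if trx_id >= self.low_limit_id:
            return False
        return trx_id not in self.active_ids
```

For example, with up_limit_id=100, low_limit_id=200 and transaction 150
active, IDs 0 and 99 are visible while 150 and 200 are not.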
The persistent InnoDB undo log used to be split into two parts:
insert_undo and update_undo. The insert_undo log was discarded at
transaction commit or rollback, and the update_undo log was processed
by the purge subsystem. As part of this change, we will only generate
a single undo log for new transactions, and the purge subsystem will
reset the DB_TRX_ID whenever a clustered index record is touched.
That is, all persistent undo log will be preserved at transaction commit
or rollback, to be removed by purge.
The InnoDB redo log format is changed in two ways:
We remove the redo log record type MLOG_UNDO_HDR_REUSE, and
we introduce the MLOG_ZIP_WRITE_TRX_ID record for updating the
DB_TRX_ID,DB_ROLL_PTR in a ROW_FORMAT=COMPRESSED table.
This is also changing the format of persistent InnoDB data files:
undo log and clustered index leaf page records. It will still be
possible via import and export to exchange data files with earlier
versions of MariaDB. The change to clustered index leaf page records
is simple: we allow DB_TRX_ID to be 0.
When it comes to the undo log, we must be able to upgrade from earlier
MariaDB versions after a clean shutdown (no redo log to apply).
While it would be nice to perform a slow shutdown (innodb_fast_shutdown=0)
before an upgrade, to empty the undo logs, we cannot assume that this
has been done. So, a separate insert_undo log may exist for recovered
uncommitted transactions. These transactions may be automatically
rolled back, or they may be in XA PREPARE state, in which case InnoDB
will preserve the transaction until an explicit XA COMMIT or XA ROLLBACK.
Upgrade has been tested by starting up MariaDB 10.2 with
./mysql-test-run --manual-gdb innodb.read_only_recovery
and then starting up this patched server with
and without --innodb-read-only.
trx_undo_ptr_t::undo: Renamed from update_undo.
trx_undo_ptr_t::old_insert: Renamed from insert_undo.
trx_rseg_t::undo_list: Renamed from update_undo_list.
trx_rseg_t::undo_cached: Merged from update_undo_cached
and insert_undo_cached.
trx_rseg_t::old_insert_list: Renamed from insert_undo_list.
row_purge_reset_trx_id(): New function to reset the columns.
This will be called for all undo processing in purge
that does not remove the clustered index record.
trx_undo_update_rec_get_update(): Allow trx_id=0 when copying the
old DB_TRX_ID of the record to the undo log.
ReadView::changes_visible(): Allow id==0. (Return true for it.
This is what speeds up the MVCC.)
row_vers_impl_x_locked_low(), row_vers_build_for_semi_consistent_read():
Implement a fast path for DB_TRX_ID=0.
Always initialize the TRX_UNDO_PAGE_TYPE to 0. Remove undo->type.
MLOG_UNDO_HDR_REUSE: Remove. This changes the redo log format!
innobase_start_or_create_for_mysql(): Set srv_undo_sources before
starting any transactions.
The parsing of the MLOG_ZIP_WRITE_TRX_ID record was successfully
tested by running the following:
./mtr --parallel=auto --mysqld=--debug=d,ib_log innodb_zip.bug56680
grep MLOG_ZIP_WRITE_TRX_ID var/*/log/mysqld.1.err
When using innodb_page_size=16k, InnoDB tables
that were created in MariaDB 10.1.0 to 10.1.20 with
PAGE_COMPRESSED=1 and
PAGE_COMPRESSION_LEVEL=2 or PAGE_COMPRESSION_LEVEL=3
would fail to load.
fsp_flags_is_valid(): When using innodb_page_size=16k, use a
stricter check for .ibd files, with the assumption that
nobody would try to use different-page-size files.
The POINT data type is being treated just like any other
geometry data type in InnoDB. The fixed-length data type
DATA_POINT had been introduced in WL#6942 based on a
misunderstanding and without appropriate review.
Because of fundamental design problems (such as a
DEFAULT POINT(0 0) value secretly introduced by InnoDB),
the code was disabled in Oracle Bug#20415831 fix.
This patch removes the dead code and definitions that were
left behind by the Oracle Bug#20415831 patch.
This is preparation for MDEV-12288, which would set DB_TRX_ID=0
when purging history. Also with that change in place, delete-marked
records must always refer to an undo log record via a nonzero
DB_TRX_ID column. (The DB_TRX_ID is only present in clustered index
leaf page records.)
btr_cur_parse_del_mark_set_clust_rec(), rec_get_trx_id():
Statically allocate the offsets
(should never use the heap). Add some debug assertions.
Replace some use of rec_get_trx_id() with row_get_rec_trx_id().
trx_undo_report_row_operation(): Add some sanity checks that are
common for all operations that produce undo log.
The field fts_token->position is not initialized in
row_merge_fts_doc_tokenize(). We cannot have that field
without changing the fulltext parser plugin ABI
(adding st_mysql_ftparser_boolean_info::position,
as it was done in MySQL 5.7 in WL#6943).
The InnoDB fulltext parser plugins "ngram" and "Mecab" that were
introduced in MySQL 5.7 do depend on that field. But the simple_parser
does not. Apparently, simple_parser is leaving the field as 0.
So, in our fix we will assume that the missing position field is 0.
While the primary purpose of innodb_force_recovery is to allow
data to be rescued from an InnoDB instance that would crash due
to some data corruption, the settings 1, 2, or 3 are relatively
safe to use and there is no need to prevent write transactions
in these modes.
The setting innodb_force_recovery=4 and above can cause database
corruption. For those modes, we already set the flag
high_level_read_only to disable modifications, except DROP TABLE.
MODIFICATIONS_NOT_ALLOWED_MSG_FORCE_RECOVERY: Remove. There is no
need to spam the error log for each refused DML operation. It suffices
to return an error to the client. There will be messages at startup
if innodb_read_only or innodb_force_recovery are preventing writes.
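The resulting policy for innodb_force_recovery can be summarized in a
short sketch. The function names and the statement_kind strings are
hypothetical; innodb_read_only is deliberately left out, and only the
threshold (4 and above) and the DROP TABLE exception come from the text
above.

```python
def high_level_read_only(innodb_force_recovery: int) -> bool:
    """Settings 4 and above can cause corruption, so writes are disabled."""
    return innodb_force_recovery >= 4


def statement_allowed(statement_kind: str, innodb_force_recovery: int) -> bool:
    """Return whether a modifying statement would be accepted in this sketch."""
    if not high_level_read_only(innodb_force_recovery):
        return True  # settings 0 to 3: write transactions are allowed
    # Modifications are refused with an error to the client, except DROP TABLE.
    return statement_kind == "DROP TABLE"
```

In this model an INSERT is accepted at innodb_force_recovery=3 and refused
at 4, while DROP TABLE is accepted at both settings.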
Comment from Codership:
To fix the problem, we changed the certification logic in Galera to treat
an insert of a child table row as exclusive, to prevent any operation on
the referenced parent table row. At the same time, update and delete of a
child table row were demoted to "shared", which makes it possible to
update/delete the referenced parent table row, but only in a later
transaction. This change allows somewhat more concurrency for foreign key
constrained transactions, while still giving a correct certification result.
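The mapping described in this comment can be written down as a small
sketch. KeyType and parent_key_type() are hypothetical names and do not
correspond to the Galera API; only the insert/update/delete mapping is
taken from the comment.

```python
from enum import Enum


class KeyType(Enum):
    SHARED = "shared"
    EXCLUSIVE = "exclusive"


def parent_key_type(child_operation: str) -> KeyType:
    """Key type used for the referenced parent row, per the comment above."""
    if child_operation == "insert":
        return KeyType.EXCLUSIVE  # blocks any operation on the parent row
    if child_operation in ("update", "delete"):
        return KeyType.SHARED     # parent row may change in a later transaction
    raise ValueError(f"unexpected operation: {child_operation}")
```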
When the btr_search_latch was split into an array of latches
in MySQL 5.7.8 as part of the Oracle Bug#20985298 fix, the "caching"
of the latch across storage engine API calls was removed, and
the field trx->has_search_latch would only be set during a short
time frame in the execution of row_search_mvcc(), which was
formerly called row_search_for_mysql().
This means that the column
INFORMATION_SCHEMA.INNODB_TRX.TRX_ADAPTIVE_HASH_LATCHED will always
report 0. That column cannot be removed in MariaDB 10.2, but it
can be removed in future releases.
trx_t::has_search_latch: Remove.
trx_assert_no_search_latch(): Remove.
row_sel_try_search_shortcut_for_mysql(): Remove a redundant condition
on trx->has_search_latch (it was always true).
sync_check_iterate(): Make the parameter const.
sync_check_functor_t: Make the operator() const, and remove result()
and the virtual destructor. There is no need to have mutable state
in the functors.
sync_checker<bool>: Replaces dict_sync_check and btrsea_sync_check.
sync_check: Replaces btrsea_sync_check.
dict_sync_check: Instantiated from sync_checker.
sync_allowed_latches: Use std::find() directly on the array.
Remove the std::vector.
TrxInInnoDB::enter(), TrxInInnoDB::exit(): Remove obviously redundant
debug assertions on trx->in_depth, and use equality comparison against 0
because it could be more efficient on some architectures.
sql_sequence.read_only: Show that the sequence can be read in
both read-only and read-write mode, and that the sequence remains
accessible after a server restart.
innodb.table_flags: Adjust the test case. Due to the MDEV-12873 fix
in 10.2, the corrupted flags for table test.td would be converted,
and a tablespace flag mismatch will occur when trying to open the file.
Remove the SHARED_SPACE flag that was erroneously introduced in
MariaDB 10.2.2, and shift the SYS_TABLES.TYPE flags back to where
they were before MariaDB 10.2.2. While doing this, ensure that
tables created with affected MariaDB versions can be loaded,
and also ensure that tables created with MySQL 5.7 using the
TABLESPACE attribute cannot be loaded.
MariaDB 10.2.2 picked the SHARED_SPACE flag from MySQL 5.7,
shifting the MariaDB 10.1 flags PAGE_COMPRESSION, PAGE_COMPRESSION_LEVEL,
ATOMIC_WRITES by one bit. The SHARED_SPACE flag would always
be written as 0 by MariaDB, because MariaDB does not support
CREATE TABLESPACE or CREATE TABLE...TABLESPACE for InnoDB.
So, instead of the bits AALLLLCxxxxxxx we would have
AALLLLC0xxxxxxx if the table was created with MariaDB 10.2.2
to 10.2.6. (AA=ATOMIC_WRITES, LLLL=PAGE_COMPRESSION_LEVEL,
C=PAGE_COMPRESSED, xxxxxxx=7 bits that were not moved.)
PAGE_COMPRESSED=NO implies LLLLC=00000. That is not a problem.
If someone created a table in MariaDB 10.2.2 or 10.2.3 with
the attribute ATOMIC_WRITES=OFF (value 2; AA=10) and without
PAGE_COMPRESSED=YES or PAGE_COMPRESSION_LEVEL, the table should be
rejected. We ignore this problem, because it should be unlikely
for anyone to specify ATOMIC_WRITES=OFF, and because 10.2.2 and
10.2.3 were not mature releases. The value ATOMIC_WRITES=ON (1)
would be interpreted as ATOMIC_WRITES=OFF, but starting with
MariaDB 10.2.4 the ATOMIC_WRITES attribute is ignored.
PAGE_COMPRESSED=YES implies that PAGE_COMPRESSION_LEVEL be between
1 and 9 and that ROW_FORMAT be COMPACT or DYNAMIC. Thus, the affected
wrong bit pattern in SYS_TABLES.TYPE is of the form AALLLL10DB00001
where D signals the presence of a DATA DIRECTORY attribute and B is 1
for ROW_FORMAT=DYNAMIC and 0 for ROW_FORMAT=COMPACT. We must interpret
this bit pattern as AALLLL1DB00001 (discarding the extraneous 0 bit).
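The adjustment can be sketched as plain bit arithmetic. The function name
and the hexadecimal masks below are assumptions derived from the bit
layout described above (7 unmoved low bits, the extraneous 0 at bit 7,
PAGE_COMPRESSED at bit 8, the level in the next 4 bits); they are not
copied from the InnoDB source.

```python
LOW_BITS = 0x7F  # the 7 low bits (DB00001) that were not moved


def adjust_sys_tables_type(type_: int) -> int:
    """Map AALLLL10DB00001 to AALLLL1DB00001; leave other values unchanged."""
    # Require bit 0 = 1, bits 1..4 = 0, bit 7 = 0 and bit 8 = 1.
    if (type_ & 0x19F) != 0x101:
        return type_
    # PAGE_COMPRESSED=YES implies a compression level between 1 and 9.
    level = (type_ >> 9) & 0xF
    if not 1 <= level <= 9:
        return type_
    # Keep the low 7 bits and shift everything above them right by one bit,
    # which discards the extraneous 0 bit.
    return (type_ & LOW_BITS) | ((type_ >> 1) & ~LOW_BITS)
```

For instance, the affected pattern 000110100100001 (level 6,
ROW_FORMAT=DYNAMIC) becomes 00011010100001, while a plain
ROW_FORMAT=DYNAMIC value such as 0100001 is returned unchanged.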
dict_sys_tables_rec_read(): Adjust the affected bit pattern when
reading the SYS_TABLES.TYPE column. In case of invalid flags,
report both SYS_TABLES.TYPE (after possible adjustment) and
SYS_TABLES.MIX_LEN.
dict_load_table_one(): Replace an unreachable condition on
!dict_tf2_is_valid() with a debug assertion. The flags will already
have been validated by dict_sys_tables_rec_read(); if that validation
fails, dict_load_table_low() will have failed.
fil_ibd_create(): Shorten an error message about a file pre-existing.
Datafile::validate_to_dd(): Clarify an error message about tablespace
flags mismatch.
ha_innobase::open(): Remove an unnecessary warning message.
dict_tf_is_valid(): Simplify and tighten the logic. Validate the
values of PAGE_COMPRESSION. Remove error log output; let the callers
handle that.
DICT_TF_BITS: Remove ATOMIC_WRITES, PAGE_ENCRYPTION, PAGE_ENCRYPTION_KEY.
The ATOMIC_WRITES is ignored once the SYS_TABLES.TYPE has been validated;
there is no need to store it in dict_table_t::flags. The PAGE_ENCRYPTION
and PAGE_ENCRYPTION_KEY are unused since MariaDB 10.1.4 (the GA release
was 10.1.8).
DICT_TF_BIT_MASK: Remove (unused).
FSP_FLAGS_MEM_ATOMIC_WRITES: Remove (the flags are never read).
row_import_read_v1(): Display an error if dict_tf_is_valid() fails.
dict_table_t::thd: Remove. This was only used by btr_root_block_get()
for reporting decryption failures, and it was only assigned by
ha_innobase::open(), and never cleared. This could mean that if a
connection is closed, the pointer would become stale, and the server
could crash while trying to report the error. It could also mean
that an error is being reported to the wrong client. It is better
to use current_thd in this case, even though it could mean that if
the code is invoked from an InnoDB background operation, there would
be no connection to which to send the error message.
Remove dict_table_t::crypt_data and dict_table_t::page_0_read.
These fields were never read.
fil_open_single_table_tablespace(): Remove the parameter "table".
The doublewrite buffer pages must fit in the first InnoDB system
tablespace data file. The checks that were added in the initial patch
(commit 112b21da37)
were at too high a level and did not cover all cases.
innodb.log_data_file_size: Test all innodb_page_size combinations.
fsp_header_init(): Never return an error. Move the change buffer creation
to the only caller that needs to do it.
btr_create(): Clean up the logic. Remove the error log messages.
buf_dblwr_create(): Try to return an error on non-fatal failure.
Check that the first data file is big enough for creating the
doublewrite buffers.
buf_dblwr_process(): Check if the doublewrite buffer is available.
Display the message only if it is available.
recv_recovery_from_checkpoint_start_func(): Remove a redundant message
about FIL_PAGE_FILE_FLUSH_LSN mismatch when crash recovery has already
been initiated.
fil_report_invalid_page_access(): Simplify the message.
fseg_create_general(): Do not emit messages to the error log.
innobase_init(): Revert the changes.
trx_rseg_create(): Refactor (no functional change).
The following options will be removed:
innodb_file_format
innodb_file_format_check
innodb_file_format_max
innodb_large_prefix
They have been deprecated in MySQL 5.7.7 (and MariaDB 10.2.2) in WL#7703.
The file_format column in two INFORMATION_SCHEMA tables will be removed:
innodb_sys_tablespaces
innodb_sys_tables
Code to update the file format tag at the end of page 0:5
(TRX_SYS_PAGE in the InnoDB system tablespace) will be removed.
When initializing a new database, the bytes will remain 0.
All references to the Barracuda file format will be removed.
Some references to the Antelope file format (meaning
ROW_FORMAT=REDUNDANT or ROW_FORMAT=COMPACT) will remain.
This basically ports WL#7704 from MySQL 8.0.0 to MariaDB 10.3.1:
commit 4a69dc2a95995501ed92d59a1de74414a38540c6
Author: Marko Mäkelä <marko.makela@oracle.com>
Date: Wed Mar 11 22:19:49 2015 +0200
The problem was that all doublewrite buffer pages must fit in the first
system data file.
Ported commit 27a34df7882b1f8ed283f22bf83e8bfc523cbfde
Author: Shaohua Wang <shaohua.wang@oracle.com>
Date: Wed Aug 12 15:55:19 2015 +0800
BUG#21551464 - SEGFAULT WHILE INITIALIZING DATABASE WHEN
INNODB_DATA_FILE SIZE IS SMALL
To 10.1 (with extended error printout).
btr_create(): If ibuf header page allocation fails, report an error and
return FIL_NULL. Similarly, if root page allocation fails, return an error.
dict_build_table_def_step: If fsp_header_init fails, return an
error code.
fsp_header_init: Return true if header initialization succeeds
and false if not.
fseg_create_general: Report an error if segment or page allocation fails.
innobase_init: If the first data file is smaller than 3M and could not
contain all doublewrite buffer pages, report an error and fail to
initialize the InnoDB plugin.
row_truncate_table_for_mysql: Report an error if fsp header init
fails.
srv_init_abort: New function to report database initialization errors.
srv_undo_tablespaces_init, innobase_start_or_create_for_mysql: If
database initialization fails, report an error and abort.
trx_rseg_create: If segment header creation fails, return.
When MySQL 5.7 introduced fulltext parser plugins to InnoDB,
it hard-coded the plugin name "ngram" to mean something special.
Because -fsanitize=undefined was issuing warnings for the
assignment in row_merge_create_index() that the value is out of
range for Boolean, we remove this code that was not intended to
be used in MariaDB 10.2.
fts_check_token(): Remove the special logic for N-gram tokens.