ha_innobase::commit_inplace_alter_table(): Defer the freeing of ctx->trx
until after the operation has been successfully committed. In this way,
rollback on a partitioned table will be possible.
rollback_inplace_alter_table(): Handle ctx->new_table == NULL when
ctx->trx != NULL.
If the tablespace is dropped or truncated after the
space->is_stopping() check in fil_crypt_get_page_throttle_func(),
we would proceed to request the page, and eventually report a fatal
error.
buf_page_get_gen(): Do not retry reading if mode==BUF_GET_POSSIBLY_FREED.
lock_rec_block_validate(): Be prepared for a NULL return value when
invoking buf_page_get_gen() with mode=BUF_GET_POSSIBLY_FREED.
Issue:
------
Prefix for externally stored columns were being stored in online_log when a
table is altered and alter causes table to be rebuilt. Space in online_log is
limited and if length of prefix of externally stored columns is very big, then
it is being written to online log without making sure if it fits. This leads to
memory corruption.
Fix:
----
After fix for Bug#16544143, there is no need to store prefixes of externally
stored columnd in online_log. Thus remove the code which stores column prefixes
for externally stored columns. Also, before writing anything on online_log,
make sure it fits to available memory to avoid memory corruption.
Read RB page for more details.
Reviewed-by: Annamalai Gurusami <annamalai.gurusami@oracle.com>
RB: 18239
In the merge of commit e7f4e61f6e
the call fil_flush_file_spaces(FIL_TYPE_LOG) is necessary.
Tablespaces will be flushed as part of the redo log
checkpoint, but the redo log will not necessarily
be flushed, depending on innodb_flush_method.
In commit e7f4e61f6e
the call fil_flush_file_spaces(FIL_LOG) is necessary.
Tablespaces will be flushed as part of the redo log
checkpoint, but the redo log will not necessarily
be flushed, depending on innodb_flush_method.
InnoDB takes a lot of time to perform null updates. The reason is that
even though an empty update vector was created, InnoDB will go on to
write undo log records and update the system columns
DB_TRX_ID and DB_ROLL_PTR in the clustered index, and of course write
redo log for all this.
This could have been fixed properly in
commit 54a492ecac more than 10 years ago.
Remove the local variable srv_buf_pool_size_org, which was always 0.
In MySQL 5.7, InnoDB was made a mandatory storage engine, which would
force InnoDB to start up when executing
mysqld --verbose --help
which is what mysql-test-run.pl is doing as a first step. With a
large innodb_buffer_pool_size, this would take a long time.
So, MySQL 5.7 includes a hack that starts up InnoDB with a smaller
buffer pool when the option --verbose is present.
MariaDB uses HAVE_LZO, not HAVE_LZO1X (which was never defined).
Also, the variable srv_lzo_disabled was never defined or read
(only declared and assigned to, in unreachable code).
The InnoDB system table column SYS_TABLES.MIX_LEN was repurposed
in InnoDB Plugin for MySQL 5.1, in
commit 91111174ee (MySQL 5.1.46).
Until MySQL 5.6, it only contained a flag DICT_TF2_TEMPORARY.
MySQL 5.6 introduced a number of flags that were transient
in nature. One of these was introduced in 5.6.5, originally
called DICT_TF2_USE_TABLESPACE and later renamed to
DICT_TF2_USE_FILE_PER_TABLE. MySQL 5.7.6 introduced logic
that insists that the flag be set for any table that does not
reside in a shared tablespace, breaking upgrade from MySQL 5.5.
MariaDB does not support shared tablespaces other than the
InnoDB system tablespace. Also, some dependencies on
SYS_TABLES.MIX_LEN were removed in an earlier fix:
MDEV-13084 MariaDB 10.2 crashes on corrupted SYS_TABLES.MIX_LEN field
(commit e813fe8622).
dict_check_sys_tables(): Remove a bogus debug assertion, and add a
comment that explains how DICT_TF2_USE_FILE_PER_TABLE is used.
dict_table_is_file_per_table(): Remove a bogus debug assertion.
Pool::mem_free(): Poison the freed memory. Assert that it was
fully initialized, because the reuse of trx_t objects will
assume that the objects were previously initialized.
Pool::~Pool(), Pool::get(): Unpoison the allocated memory,
and mark it initialized.
trx_free(): After invoking Pool::mem_free(), unpoison
trx_t::mutex and trx_t::undo_mutex, because MutexMonitor
will access these even for freed trx_t objects.
These test can sporadically show mutex deadlock warnings between LOCK_wsrep_thd
and LOCK_thd_data mutexes. This means that these mutexes can be locked in opposite
order by different threads, and thus result in deadlock situation.
To fix such issue, the locking policy of these mutexes should be revised and
enforced to be uniform. However, a quick code review shows that the number of
lock/unlock operations for these mutexes combined is between 100-200, and all these
mutex invocations should be checked/fixed.
On the other hand, it turns out that LOCK_wsrep_thd is used for protecting access to
wsrep variables of THD (wsrep_conflict_state, wsrep_query_state), whereas LOCK_thd_data
protects query, db and mysys_var variables in THD. Extending LOCK_thd_data to protect
also wsrep variables looks like a viable solution, as there should not be a use case
where separate threads need simultaneous access to wsrep variables and THD data variables.
In this commit LOCK_wsrep_thd mutex is refactored to be replaced by LOCK_thd_data.
By bluntly replacing LOCK_wsrep_thd by LOCK_thd_data, will result in double locking
of LOCK_thd_data, and some adjustements have been performed to fix such situations.
dict_load_table_low(): When flagging an error, assign *table = NULL.
Failure to do so could cause a crash if an error was flagged when
accessing INFORMATION_SCHEMA.INNODB_SYS_TABLES.
While the test case crashes a MariaDB 10.2 debug build only,
let us apply the fix to the earliest applicable MariaDB series (10.0)
to avoid any data corruption on a table-rebuilding ALTER TABLE
using ALGORITHM=INPLACE.
innobase_create_key_defs(): Use altered_table->s->primary_key
when a new primary key is being created.
Problem:
=======
InnoDB cleans all temporary undo logs during commit. During rollback
of secondary index entry, InnoDB tries to build the previous version
of clustered index. It leads to access of freed undo page during
previous transaction commit and it leads to undo log corruption.
Solution:
=========
During rollback, temporary undo logs should not try to build
the previous version of the record.
disable online alter add primary key for innodb, if the
table is opened/locked more than once in the current connection
(see assert in ha_innobase::add_index())
slave node killed himself.
Problem:- If we try to delete table with foreign key and table whom it is
referring with wsrep_slave_threads>1 then galera tries to execute both
Delete_rows_log-event in parallel, which should not happen.
Solution:- This is happening because we do not have foreign key info in
write set. Upto version 10.2.7 it used to work fine. Actually it happening
because of issue in commit 2f342c4. wsrep_must_process_fk has changed to
make it similar to original condition.
fil_crypt_rotate_pages
If tablespace is marked as stopping stop also page rotation
fil_crypt_flush_space
If tablespace is marked as stopping do not try to read
page 0 and write it back.
if volume can't be opened due to permissions, or
IOCTL_STORAGE_QUERY_PROPERTY fails with not implemented, do not report it.
Those errors happen, there is nothing user can do.
This patch amends fix for MDEV-12948.
Problem was that if tablespace was encrypted we try to copy
also page 0 from read buffer to write buffer that are in
that case the same memory area.
fil_iterate
When tablespace is encrypted or compressed its
first page (i.e. page 0) is not encrypted or
compressed and there is no need to copy buffer.
Problem was that if tablespace was encrypted we try to copy
also page 0 from read buffer to write buffer that are in
that case the same memory area.
fil_iterate
When tablespace is encrypted or compressed its
first page (i.e. page 0) is not encrypted or
compressed and there is no need to copy buffer.
If innodb_fast_shutdown<2, all transactions of active connections
will be rolled back on shutdown. This can take a long time, and
the systemd shutdown timeout should be extended during the wait.
logs_empty_and_mark_files_at_shutdown(): Extend the timeout when
waiting for other threads to complete.
This reverts commit 76ec37f522.
This behaviour change will be done separately in:
MDEV-15832 With innodb_fast_shutdown=3, skip the rollback
of connected transactions
In async IO completion code, after reading a page,Innodb can wait for
completion of other bufferpool reads.
This is for example what happens if change-buffering is active.
Innodb on Windows could deadlock, as it did not have dedicated threads
for processing change buffer asynchronous reads.
The fix for that is to have windows now has the same background threads,
including dedicated thread for ibuf, and log AIOs.
The ibuf/read completions are now dispatched to their threads with
PostQueuedCompletionStatus(), the write and log completions are processed
in thread where they arrive.
fil_crypt_read_crypt_data(): Do not attempt to read the tablespace
if the file is unaccessible due to a pending DDL operation, such as
renaming the file or DROP TABLE or TRUNCATE TABLE. This is only
reducing the probability of the race condition, not completely
preventing it.
Problem was that key rotation from encrypted to unecrypted was skipped
when encryption is disabled (i.e. set global innodb-encrypt-tables=OFF).
fil_crypt_needs_rotation
If encryption is disabled (i.e. innodb-encrypt-tables=off)
and there is tablespaces using default encryption (e.g.
system tablespace) that are still encrypted state we need
to rotate them from encrypted state to unencrypted state.
buf_flush_remove(): Disable the output for now, because we
certainly do not want this after every page flush on shutdown.
It must be rate-limited somehow. There already is a timeout
extension for waiting the page cleaner to exit in
logs_empty_and_mark_files_at_shutdown().
log_write_up_to(): Use correct format.
srv_purge_should_exit(): Move the timeout extension to the
appropriate place, from one of the callers.
Use systemd EXTEND_TIMEOUT_USEC to advise systemd of progress
Move towards progress measures rather than pure time based measures.
Progress reporting at numberious shutdown/startup locations incuding:
* For innodb_fast_shutdown=0 trx_roll_must_shutdown() for rolling back incomplete transactions.
* For merging the change buffer (in srv_shutdown(bool ibuf_merge))
* For purging history, srv_do_purge
Thanks Marko for feedback and suggestions.
Suggested by Marko on github pr #576
buf_all_freed only needs to be called once, not 3 times.
buf_all_freed will always return TRUE if it returns.
It will crash if any page was not flushed so its effectively
an assert anyway.
The following calls are likely redundant and could be removed:
fil_flush_file_spaces(FIL_TYPE_TABLESPACE);
fil_flush_file_spaces(FIL_TYPE_LOG);