When updating a table with virtual BLOB columns, the following might
happen:
- an old record is read from the table, it has no virtual blob values
- update_virtual_fields() is run, vcol blob gets its value into the
record. But only a pointer to the value is in the table->record[0],
the value is in Field_blob::value String (but it doesn't have to be!
it can be in the record, if the column is just a copy of another
columns: ... b VARCHAR, c BLOB AS (b) ...)
- store_record(table,record[1]), old record now is in record[1]
- fill_record() prepares new values in record[0], vcol blob is updated,
new value replaces the old one in the Field_blob::value
- now both record[1] and record[0] have a pointer that points to the
*new* vcol blob value. Or record[1] has a pointer to nowhere if
Field_blob::value had to realloc.
To fix this I have introduced a new String object 'read_value' in
Field_blob. When updating virtual columns when a row has been read,
the allocated value is stored in 'read_value' instead of 'value'. The
allocated blobs for the new row is stored in 'value' as before.
I also made, as a safety precaution, the insert delayed handling of
blobs more general by using value to store strings instead of the
record. This ensures that virtual functions on delayed insert should
work in as in the case of normal insert.
Triggers are now properly updating the read, write and vcol maps for used
fields. This means that we don't need VCOL_UPDATE_FOR_READ_WRITE anymore
and there is no need for any other special handling of triggers in
update_virtual_fields().
To be able to test how many times virtual fields are invoked, I also
relaxed rules that one can use local (@) variables in DEFAULT and non
persistent virtual field expressions.
- MDEV-11621 rpl.rpl_gtid_stop_start fails sporadically in buildbot
- MDEV-11620 rpl.rpl_upgrade_master_info fails sporadically in buildbot
The issue above was probably that the build machine was overworked and the
shutdown took longer than 30 resp 10 seconds, which caused MyISAM tables
to be marked as crashed.
Fixed by flushing myisam tables before doing a forced shutdown/kill.
I also increased timeout for forced shutdown from 10 seconds to 60 seconds
to fix other possible issues on slow machines.
Fixed also some compiler warnings
- Atomic writes are enabled by default
- Automatically detect if device supports atomic write and use it if
atomic writes are enabled
- Remove ATOMIC WRITE options from CREATE TABLE
- Atomic write is a device option, not a table options as the table may
crash if the media changes
- Add support for SHANNON SSD cards
buf_flush_init_flush_rbt() was called too early in MariaDB server 10.0,
10.1, MySQL 5.5 and MySQL 5.6. The memory leak has been fixed in
the XtraDB storage engine and in MySQL 5.7.
As a result, when the server is started to initialize new data files,
the buf_pool->flush_rbt will be created unnecessarily and then leaked.
This memory leak was noticed in MariaDB server 10.1 when running the
test encryption.innodb_first_page.
Sometimes innodb_data_file_size_debug was reported as INT UNSIGNED
instead of BIGINT UNSIGNED. Make it uint instead of ulong to get
a more deterministic result.
Memory was leaked when ALTER TABLE is attempted on a table
that contains corrupted indexes.
The memory leak was reported by AddressSanitizer for the test
innodb.innodb_corrupt_bit. The leak was introduced into
MariaDB Server 10.0.26, 10.1.15, 10.2.1 by the following:
commit c081c978a2
Merge: 1d21b22155a482e76e65
Author: Sergei Golubchik <serg@mariadb.org>
Date: Tue Jun 21 14:11:02 2016 +0200
Merge branch '5.5' into bb-10.0
This makes no functional change to MariaDB Server 10.2.
XtraDB is not being built, and in InnoDB, there was no
memory leak in buf_dblwr_process() in MariaDB Server 10.2.
Most of them are trivial, except for the thread_sync_t refactoring.
We must not invoke memset() on non-POD objects.
mtflush_work_initialized: Remove. Refer to mtflush_ctx != NULL instead.
thread_sync_t::thread_sync_t(): Refactored from
buf_mtflu_handler_init().
thread_sync_t::~thread_sync_t(): Refactored from
buf_mtflu_io_thread_exit().
MariaDB Server is unnecessarily evaluating the arguments of
DBUG_PRINT() macros when the label is not defined.
The macro DBUG_LOG() for C++ operator<< output which was added for
InnoDB diagnostics in MySQL 5.7 is missing from MariaDB. Unlike the
MySQL 5.7 implementation, MariaDB will avoid allocating and
initializing the output string when the label is not defined.
Introduce DBUG_OUT("crypt") and DBUG_OUT("checksum") for some InnoDB
diagnostics, replacing some use of ib::info().
In the backport of Bug#24450908 UNDO LOG EXISTS AFTER SLOW SHUTDOWN
from MySQL 5.7 to the MySQL 5.6 based MariaDB Server 10.1, we must
use a mutex when HAVE_ATOMIC_BUILTINS is not defined.
Also, correct a function comment. In MySQL 5.6 and MariaDB Server 10.1,
also temporary InnoDB tables are redo-logged.
InnoDB shutdown failed to properly take fil_crypt_thread() into account.
The encryption threads were signalled to shut down together with other
non-critical tasks. This could be much too early in case of slow shutdown,
which could need minutes to complete the purge. Furthermore, InnoDB
failed to wait for the fil_crypt_thread() to actually exit before
proceeding to the final steps of shutdown, causing the race conditions.
Furthermore, the log_scrub_thread() was shut down way too early.
Also it should remain until the SRV_SHUTDOWN_FLUSH_PHASE.
fil_crypt_threads_end(): Remove. This would cause the threads to
be terminated way too early.
srv_buf_dump_thread_active, srv_dict_stats_thread_active,
lock_sys->timeout_thread_active, log_scrub_thread_active,
srv_monitor_active, srv_error_monitor_active: Remove a race condition
between startup and shutdown, by setting these in the startup thread
that creates threads, not in each created thread. In this way, once the
flag is cleared, it will remain cleared during shutdown.
srv_n_fil_crypt_threads_started, fil_crypt_threads_event: Declare in
global rather than static scope.
log_scrub_event, srv_log_scrub_thread_active, log_scrub_thread():
Declare in static rather than global scope. Let these be created by
log_init() and freed by log_shutdown().
rotate_thread_t::should_shutdown(): Do not shut down before the
SRV_SHUTDOWN_FLUSH_PHASE.
srv_any_background_threads_are_active(): Remove. These checks now
exist in logs_empty_and_mark_files_at_shutdown().
logs_empty_and_mark_files_at_shutdown(): Shut down the threads in
the proper order. Keep fil_crypt_thread() and log_scrub_thread() alive
until SRV_SHUTDOWN_FLUSH_PHASE, and check that they actually terminate.
Port a bug fix from MySQL 5.7, so that all undo log pages will be freed
during a slow shutdown. We cannot scrub pages that are left allocated.
commit 173e171c6fb55f064eea278c76fbb28e2b1c757b
Author: Thirunarayanan Balathandayuthapani <thirunarayanan.balathandayuth@oracle.com>
Date: Fri Sep 9 18:01:27 2016 +0530
Bug #24450908 UNDO LOG EXISTS AFTER SLOW SHUTDOWN
Problem:
========
1) cached undo segment is not removed from rollback segment history
(RSEG_HISTORY) during slow shutdown. In other words, If the segment is
not completely free, we are failing to remove an entry from the history
list. While starting the server, we traverse all rollback segment slots
history list and make it as list of undo logs to be purged in purge
queue.
In that case, purge queue will never be empty after slow shutdown.
2) Freeing of undo log segment is linked with removing undo log header
from history.
Fix:
====
1) Have separate logic of removing the undo log header from
history list from rollback segment slots and remove it from
rollback segment history even though it is not completely free.
Reviewed-by: Debarun Banerjee <debarun.banerjee@oracle.com>
Reviewed-by: Marko Mäkelä <marko.makela@oracle.com>
RB:13672
MariaDB Server 10.0.28 and 10.1.19 merged code from Percona XtraDB
that introduced support for compressed columns. Much but not all
of this code was disabled by placing #ifdef HAVE_PERCONA_COMPRESSED_COLUMNS
around it.
Among the unused but not disabled code is code to access
some new system tables related to compressed columns.
The creation of these system tables SYS_ZIP_DICT and SYS_ZIP_DICT_COLS
would cause a crash in --innodb-read-only mode when upgrading
from an earlier version to 10.0.28 or 10.1.19.
Let us remove all the dead code related to compressed columns.
Users who already upgraded to 10.0.28 and 10.1.19 will have the two
above mentioned empty tables in their InnoDB system tablespace.
Subsequent versions of MariaDB Server will completely ignore those tables.
10.1 is merged into 10.2 now. Two issues are left to fix:
(1) encryption.innochecksum test
(2) read_page0 vs page_0_crypt_read
(1) innochecksum tool did not compile after merge because
buf_page_is_corrupted uses fil_crypt_t that has been changed.
extra/CMakeLists.txt: Added fil/fil0crypt.cc as dependency
as we need to use fil_crypt_verify_checksum for encrypted pages.
innochecksum.cc: If we think page is encrypted i.e.
FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION != 0 we call
fil_crypt_verify_checksum() function to compare calculated
checksum to stored checksum calculated after encryption
(this is stored on different offset i.e.
FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION + 4).
If checksum does not match we call normal buf_page_is_corrupted
to compare calculated checksum to stored checksum.
fil0crypt.cc: add #ifdef UNIV_INNOCHECKSUM to be able to compile
this file for innochecksum tool.
(2) read_page0 is not needed and thus removed.
after aborted InnoDB startup
This bug was repeatable by starting MariaDB 10.2 with an
invalid option, such as --innodb-flush-method=foo.
It is not repeatable in MariaDB 10.1 in the same way, but the
problem exists already there.
after aborted InnoDB startup
This bug was repeatable by starting MariaDB 10.2 with an
invalid option, such as --innodb-flush-method=foo.
It is not repeatable in MariaDB 10.1 in the same way, but the
problem exists already there.
The upper limit of innodb_spin_wait_delay was ~0UL. It does not make
any sense to wait more than a few dozens of microseconds between
attempts to acquire a busy mutex.
Make the new upper limit 6000. ut_delay(6000) could correspond to
several milliseconds even today.
Problem was that log_scrub function did not take required log_sys mutex.
Background: Unused space in log blocks are padded with MLOG_DUMMY_RECORD if innodb-scrub-log
is enabled. As log files are written on circular fashion old log blocks can be reused
later for new redo-log entries. Scrubbing pads unused space in log blocks to avoid visibility
of the possible old redo-log contents.
log_scrub(): Take log_sys mutex
log_pad_current_log_block(): Increase srv_stats.n_log_scrubs if padding is done.
srv0srv.cc: Set srv_stats.n_log_scrubs to export vars innodb_scrub_log
ha_innodb.cc: Export innodb_scrub_log to global status.
The C preprocessor symbol WITH_NUMA is never defined. Instead, the symbol
HAVE_LIBNUMA is used for checking if the feature is to be used.
If cmake -DWITH_NUMA=OFF is specified, HAVE_LIBNUMA will not be defined
at compilation time even if the library is available.
If cmake -DWITH_NUMA=ON is specified but the library is not available
at configuration time, the compilation will be aborted.
This commit is for optimizing WSREP(thd) macro.
#define WSREP(thd) \
(WSREP_ON && wsrep && (thd && thd->variables.wsrep_on))
In this we can safely remove wsrep and thd. We are not removing WSREP_ON
because this will change WSREP(thd) behaviour.
Patch Credit:- Nirbhay Choubay, Sergey Vojtovich
The InnoDB source code contains quite a few references to a closed-source
hot backup tool which was originally called InnoDB Hot Backup (ibbackup)
and later incorporated in MySQL Enterprise Backup.
The open source backup tool XtraBackup uses the full database for recovery.
So, the references to UNIV_HOTBACKUP are only cluttering the source code.
The configuration parameter innodb_use_fallocate, which is mapped to
the variable srv_use_posix_fallocate, has no effect in MariaDB 10.2.2
or MariaDB 10.2.3.
Thus the configuration parameter and the variable should be removed.
fil_space_t::recv_size: New member: recovered tablespace size in pages;
0 if no size change was read from the redo log,
or if the size change was implemented.
fil_space_set_recv_size(): New function for setting space->recv_size.
innodb_data_file_size_debug: A debug parameter for setting the system
tablespace size in recovery even when the redo log does not contain
any size changes. It is hard to write a small test case that would
cause the system tablespace to be extended at the critical moment.
recv_parse_log_rec(): Note those tablespaces whose size is being changed
by the redo log, by invoking fil_space_set_recv_size().
innobase_init(): Correct an error message, and do not require a larger
innodb_buffer_pool_size when starting up with a smaller innodb_page_size.
innobase_start_or_create_for_mysql(): Allow startup with any initial
size of the ibdata1 file if the autoextend attribute is set. Require
the minimum size of fixed-size system tablespaces to be 640 pages,
not 10 megabytes. Implement innodb_data_file_size_debug.
open_or_create_data_files(): Round the system tablespace size down
to pages, not to full megabytes, (Our test truncates the system
tablespace to more than 800 pages with innodb_page_size=4k.
InnoDB should not imagine that it was truncated to 768 pages
and then overwrite good pages in the tablespace.)
fil_flush_low(): Refactored from fil_flush().
fil_space_extend_must_retry(): Refactored from
fil_extend_space_to_desired_size().
fil_mutex_enter_and_prepare_for_io(): Extend the tablespace if
fil_space_set_recv_size() was called.
The test case has been successfully run with all the
innodb_page_size values 4k, 8k, 16k, 32k, 64k.
* some of these tests run just fine with InnoDB:
-> s/have_xtradb/have_innodb/
* sys_var tests did basic tests for xtradb only variables
-> remove them, they're useless anyway (sysvar_innodb does it better)
* multi_update had innodb specific tests
-> move to multi_update_innodb.test
Problem was that for encryption we use temporary scratch area for
reading and writing tablespace pages. But if page was not really
decrypted the correct updated page was not moved to scratch area
that was then written. This can happen e.g. for page 0 as it is
newer encrypted even if encryption is enabled and as we write
the contents of old page 0 to tablespace it contained naturally
incorrect space_id that is then later noted and error message
was written. Updated page with correct space_id was lost.
If tablespace is encrypted we use additional
temporary scratch area where pages are read
for decrypting readptr == crypt_io_buffer != io_buffer.
Destination for decryption is a buffer pool block
block->frame == dst == io_buffer that is updated.
Pages that did not require decryption even when
tablespace is marked as encrypted are not copied
instead block->frame is set to src == readptr.
If tablespace was encrypted we copy updated page to
writeptr != io_buffer. This fixes above bug.
For encryption we again use temporary scratch area
writeptr != io_buffer == dst
that is then written to the tablespace
(1) For normal tables src == dst == writeptr
ut_ad(!encrypted && !page_compressed ?
src == dst && dst == writeptr + (i * size):1);
(2) For page compressed tables src == dst == writeptr
ut_ad(page_compressed && !encrypted ?
src == dst && dst == writeptr + (i * size):1);
(3) For encrypted tables src != dst != writeptr
ut_ad(encrypted ?
src != dst && dst != writeptr + (i * size):1);
Replace all exit() calls in InnoDB with abort() [possibly via ut_a()].
Calling exit() in a multi-threaded program is problematic also for
the reason that other threads could see corrupted data structures
while some data structures are being cleaned up by atexit() handlers
or similar.
In the long term, all these calls should be replaced with something
that returns an error all the way up the call stack.
fil_tablespace_iterate(): Call fil_space_destroy_crypt_data() to
invoke mutex_free() for the mutex_create() that was done in
fil_space_read_crypt_data(). Also, remember to free
iter.crypt_io_buffer.
The failure to call mutex_free() would cause sync_latch_meta_destroy()
to access freed memory on shutdown. This affected the IMPORT of
encrypted tablespaces.
fil_space_crypt_cleanup(): Call mutex_free() to pair with
fil_space_crypt_init().
fil_space_destroy_crypt_data(): Call mutex_free() to pair with
fil_space_create_crypt_data() and fil_space_read_crypt_data().
fil_crypt_threads_cleanup(): Call mutex_free() to pair with
fil_crypt_threads_init().
fil_space_free_low(): Invoke fil_space_destroy_crypt_data().
fil_close(): Invoke fil_space_crypt_cleanup(), just like
fil_init() invoked fil_space_crypt_init().
Datafile::shutdown(): Set m_crypt_info=NULL without dereferencing
the pointer. The object will be freed along with the fil_space_t
in fil_space_free_low().
Remove some unnecessary conditions (ut_free(NULL) is OK).
srv_shutdown_all_bg_threads(): Shut down the encryption threads
by calling fil_crypt_threads_end().
srv_shutdown_bg_undo_sources(): Do not prematurely call
fil_crypt_threads_end(). Many pages can still be written by
change buffer merge, rollback of incomplete transactions, and
purge, especially in slow shutdown (innodb_fast_shutdown=0).
innobase_shutdown_for_mysql(): Call fil_crypt_threads_cleanup()
also when innodb_read_only=1, because the threads will have been
created also in that case.
sync_check_close(): Re-enable the invocation of sync_latch_meta_destroy()
to free the mutex list.