Commit graph

317 commits

Author SHA1 Message Date
Marko Mäkelä
67e9c4cf6c Adapt the test case for Oracle Bug#25385590
buf_chunk_not_freed(), logs_empty_and_mark_files_at_shutdown():
Relax debug assertions when innodb_force_recovery=6
implies innodb_read_only.
2017-04-26 23:03:33 +03:00
Thirunarayanan Balathandayuthapani
88a84f49b3 Bug #25167032 CRASH WHEN ASSIGNING MY_ERRNO - MISSING MY_THREAD_INIT IN BACKGROUND THREAD
Description:
===========
Add my_thread_init() and my_thread_exit() for background threads, which
initialize and free the st_my_thread_var structure.

Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
RB: 15003
2017-04-26 23:03:32 +03:00
rahul malik
dbe4c4e354 BUG#25330449 ASSERT SIZE==SPACE->SIZE DURING BUF_READ_AHEAD_RANDOM
Problem:

During read-ahead, the wrong page size is used to calculate the tablespace size.

Fix:

Use physical page size to calculate tablespace size

Reviewed-by: Satya Bodapati
RB: 14993
2017-04-26 23:03:32 +03:00
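The fix in dbe4c4e354 can be illustrated with a minimal sketch. The helper `space_size_in_pages()` is hypothetical and is not an InnoDB function; it only shows why the divisor matters. For a compressed tablespace the physical (on-disk) page size is smaller than the logical page size, so dividing the file size by the logical size yields too few pages.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of InnoDB): number of pages in a
// tablespace file of file_bytes bytes, given a page size in bytes.
inline std::uint64_t space_size_in_pages(std::uint64_t file_bytes,
                                         std::uint64_t page_bytes)
{
    assert(page_bytes != 0);
    return file_bytes / page_bytes;
}
```

With a 1 MiB file, an 8 KiB physical page size gives 128 pages, while the 16 KiB logical page size would give only 64.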
Darshan M N
16ed1f9c31 BUG#25053705 INVALID I/O ON TABLE AFTER TRUNCATE
Issue:
======
The issue is that if an FTS index is present in a table, the space size is
incorrectly calculated in the case of truncate, which results in an invalid
read.

Fix:
====
Use a different space size calculation in truncate if FTS indexes are
present.

RB:14755
Reviewed-by: Shaohua Wang <shaohua.wang@oracle.com>
2017-04-26 23:03:30 +03:00
Knut Anders Hatlen
9df0426103 Bug#25048573: STD::MAP INSTANTIATIONS CAUSE STATIC ASSERT FAILURES ON FREEBSD 11
Problem: Some instantiations of std::map have discrepancies between
the value_type of the map and the value_type of the map's allocator.
On FreeBSD 11 this is detected by Clang, and an error is raised at
compilation time.

Fix: Specify the correct value_type for the allocators.

Also fix an unused variable warning in storage/innobase/os/os0file.cc.
2017-04-26 23:03:29 +03:00
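The kind of mismatch described in 9df0426103 can be sketched as follows. This is an illustration only: `std::allocator` stands in for whatever allocator the real code uses, and the names `entry_t`, `id_name_map` and `demo_map_size()` are made up for the example. The point is that the value_type of a `std::map<Key, T>` is `std::pair<const Key, T>`, with a const key, and the allocator has to be instantiated with exactly that type.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// The map's value_type has a const key.
typedef std::pair<const int, std::string> entry_t;

// A mismatching declaration would pass std::pair<int, std::string>
// (no const) to the allocator instead of entry_t.

// Allocator instantiated with the map's own value_type.
typedef std::map<int, std::string, std::less<int>,
                 std::allocator<entry_t> > id_name_map;

static_assert(std::is_same<id_name_map::value_type,
                           id_name_map::allocator_type::value_type>::value,
              "allocator value_type must match the map value_type");

// Small usage example: insert two entries and report the size.
inline std::size_t demo_map_size()
{
    id_name_map m;
    m[1] = "one";
    m[2] = "two";
    return m.size();
}
```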
Marko Mäkelä
da76c1bd3e Minor cleanup 2017-04-26 23:03:28 +03:00
Marko Mäkelä
472b5f0d1f Follow-up to Bug#24346574 PAGE CLEANER THREAD, ASSERT BLOCK->N_POINTERS == 0
Silence the Valgrind warnings on instrumented builds (-DWITH_VALGRIND).

assert_block_ahi_empty_on_init(): A variant of
assert_block_ahi_empty() that declares n_pointers initialized and then
asserts that n_pointers==0.

In Valgrind-instrumented builds, InnoDB declares allocated memory
uninitialized.
2017-04-26 23:03:27 +03:00
Marko Mäkelä
e63ead68bf Bug#24346574 PAGE CLEANER THREAD, ASSERT BLOCK->N_POINTERS == 0
btr_search_drop_page_hash_index(): Do not return before ensuring
that block->index=NULL, even if !btr_search_enabled. We would
typically still skip acquiring the AHI latch when the AHI is
disabled, because block->index would already be NULL. Only if the AHI
is in the process of being disabled, we would wait for the AHI latch
and then notice that block->index=NULL and return.

The above bug was a regression caused in MySQL 5.7.9 by the fix of
Bug#21407023: DISABLING AHI SHOULD AVOID TAKING AHI LATCH

The rest of this patch improves diagnostics by adding assertions.

assert_block_ahi_valid(): A debug predicate for checking that
block->n_pointers!=0 implies block->index!=NULL.

assert_block_ahi_empty(): A debug predicate for checking that
block->n_pointers==0.

buf_block_init(): Instead of assigning block->n_pointers=0,
assert_block_ahi_empty(block).

buf_pool_clear_hash_index(): Clarify comments, and assign
block->n_pointers=0 before assigning block->index=NULL.
The wrong ordering could make block->n_pointers appear incorrect in
debug assertions. This bug was introduced in MySQL 5.1.52 by
Bug#13006367 62487: INNODB TAKES 3 MINUTES TO CLEAN UP THE
ADAPTIVE HASH INDEX AT SHUTDOWN

i_s_innodb_buffer_page_get_info(): Add a comment that
the IS_HASHED column in the INFORMATION_SCHEMA views
INNODB_BUFFER_POOL_PAGE and INNODB_BUFFER_PAGE_LRU may
show false positives (there may be no pointers after all.)

ha_insert_for_fold_func(), ha_delete_hash_node(),
ha_search_and_update_if_found_func(): Use atomics for
updating buf_block_t::n_pointers. While buf_block_t::index is
always protected by btr_search_x_lock(index), in
ha_insert_for_fold_func() the n_pointers-- may belong to
another dict_index_t whose btr_search_latches[] we are not holding.

RB: 13879
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
2017-04-26 23:03:27 +03:00
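The two debug predicates introduced in e63ead68bf can be sketched roughly as below. The types `block_stub` and `index_stub` are simplified stand-ins invented for this example, not the real buf_block_t and dict_index_t, and the functions return bool instead of asserting so that they can be exercised directly. `std::atomic` is used for the counter to mirror the note about updating n_pointers atomically.

```cpp
#include <atomic>
#include <cassert>

struct index_stub {};  // stand-in for dict_index_t

struct block_stub {    // stand-in for buf_block_t
    std::atomic<unsigned> n_pointers;  // number of hash pointers to this block
    index_stub* index;                 // index the block is hashed for, or null
    block_stub() : n_pointers(0), index(nullptr) {}
};

// n_pointers != 0 implies index != NULL
inline bool block_ahi_valid(const block_stub& b)
{
    return b.n_pointers.load() == 0 || b.index != nullptr;
}

// n_pointers == 0
inline bool block_ahi_empty(const block_stub& b)
{
    return b.n_pointers.load() == 0;
}
```

A block with a nonzero n_pointers and a null index violates the first predicate, which is the state the reordering in buf_pool_clear_hash_index() is meant to avoid exposing.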
Marko Mäkelä
0871a00a62 MDEV-12545 Reduce the amount of fil_space_t lookups
buf_flush_write_block_low(): Acquire the tablespace reference once,
and pass it to lower-level functions. This is only a start; further
calls may be removed.

fil_decompress_page(): Remove unsafe use of fil_space_get_by_id().
2017-04-21 18:12:10 +03:00
Marko Mäkelä
47141c9d9b Fix some InnoDB type mismatch
On 64-bit Windows, sizeof(ulint)!=sizeof(ulong).
2017-04-21 18:04:02 +03:00
Marko Mäkelä
5684aa220c MDEV-12488 Remove type mismatch in InnoDB printf-like calls
Alias the InnoDB ulint and lint data types to size_t and ssize_t,
which are the standard names for the machine-word-width data types.

Correspondingly, define ULINTPF as "%zu" and introduce ULINTPFx as "%zx".
In this way, better compiler warnings for type mismatch are possible.

Furthermore, use PRIu64 for that 64-bit format, and define
the feature macro __STDC_FORMAT_MACROS to enable it on Red Hat systems.

Fix some errors in error messages, and replace some error messages
with assertions.
Most notably, an IMPORT TABLESPACE error message in InnoDB was
displaying the number of columns instead of the mismatching flags.
2017-04-21 18:03:15 +03:00
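The format macros described in 5684aa220c can be sketched like this. The typedef and the two macro definitions follow the commit message; the helper `format_page()` and its message text are hypothetical and exist only to show the macros in a printf-like call.

```cpp
#ifndef __STDC_FORMAT_MACROS
#define __STDC_FORMAT_MACROS  // needed before <inttypes.h> on some older C++ toolchains
#endif
#include <inttypes.h>
#include <cstddef>
#include <cstdio>
#include <string>

typedef size_t ulint;    // machine-word-width unsigned integer
#define ULINTPF  "%zu"   // decimal format for ulint
#define ULINTPFx "%zx"   // hexadecimal format for ulint

// Hypothetical helper: format a page number (ulint) and an LSN (64-bit).
inline std::string format_page(ulint page_no, uint64_t lsn)
{
    char buf[64];
    std::snprintf(buf, sizeof buf,
                  "page " ULINTPF " (0x" ULINTPFx ") lsn %" PRIu64,
                  page_no, page_no, lsn);
    return std::string(buf);
}
```

Because the format strings now name the exact argument types, a compiler with format checking enabled can flag a call that passes, say, an unsigned long where a ulint is expected.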
Marko Mäkelä
8423294acf Make InnoDB doublewrite buffer creation more robust.
buf_dblwr_create(): Remove a bogus check for the buffer pool size.
Theoretically, there is no problem if the doublewrite buffer is
larger than the buffer pool. It could only cause trouble on crash
recovery, and on recovery the doublewrite buffer is read to a buffer
that is allocated outside of the buffer pool. Moreover, this check
was only performed when the database was initialized for the first
time.

On a normal startup, buf_dblwr_init() would not enforce any
rule on the innodb_buffer_pool_size.

Furthermore, in case of an error, commit the mini-transaction in order
to avoid an assertion failure on shutdown. Yes, this will leave the
doublewrite buffer in a corrupted state, but the doublewrite buffer
should only be initialized when the data files are being initialized
from scratch in the first place.
2017-04-05 16:24:44 +03:00
Marko Mäkelä
0d34dd7cfb MDEV-11840 InnoDB: "Cannot open <ib_buffer_pool file>" should not be an error
buf_load(): When the file cannot be opened for reading, issue a note,
not an error message.
2017-04-05 10:16:11 +03:00
Marko Mäkelä
6e105d7535 Merge 10.1 into 10.2 2017-04-04 07:59:25 +03:00
Marko Mäkelä
9505c96839 MDEV-12428 SIGSEGV in buf_page_decrypt_after_read() during DDL
Also, some MDEV-11738/MDEV-11581 post-push fixes.

In MariaDB 10.1, there is no fil_space_t::is_being_truncated field,
and the predicates fil_space_t::stop_new_ops and fil_space_t::is_stopping()
are interchangeable. I requested the fil_space_t::is_stopping() to be added
in the review, but some added checks for fil_space_t::stop_new_ops were
not replaced with calls to fil_space_t::is_stopping().

buf_page_decrypt_after_read(): In this low-level I/O operation, we must
look up the tablespace if it exists, even though future I/O operations
have been blocked on it due to a pending DDL operation, such as DROP TABLE
or TRUNCATE TABLE or other table-rebuilding operations (ALTER, OPTIMIZE).
Pass a parameter to fil_space_acquire_low() telling that we are performing
a low-level I/O operation and the fil_space_t::is_stopping() status should
be ignored.
2017-04-03 22:09:28 +03:00
Sergei Golubchik
da4d71d10d Merge branch '10.1' into 10.2 2017-03-30 12:48:42 +02:00
Marko Mäkelä
97acc4a1c3 MDEV-12270 Port MySQL 8.0 Bug#21141390 REMOVE UNUSED FUNCTIONS AND CONVERT GLOBAL SYMBOLS TO STATIC
InnoDB defines some functions that are not called at all.
Other functions are called, but only from the same compilation unit.

Remove some function declarations and definitions, and add 'static'
keywords. Some symbols must be kept for separately compiled tools,
such as innochecksum.
2017-03-17 12:48:50 +02:00
Marko Mäkelä
4e1116b2c6 MDEV-12271 Port MySQL 8.0 Bug#23150562 REMOVE UNIV_MUST_NOT_INLINE AND UNIV_NONINL
Also, remove empty .ic files that were not removed by my MySQL commit.

Problem:
InnoDB used to support a compilation mode that allowed choosing
whether the function definitions in .ic files are to be inlined or not.
This stopped making sense when InnoDB moved to C++ in MySQL 5.6
(and ha_innodb.cc started to #include .ic files), and more so in
MySQL 5.7 when inline methods and functions were introduced
in .h files.

Solution:
Remove all references to UNIV_NONINL and UNIV_MUST_NOT_INLINE from
all files, assuming that the symbols are never defined.
Remove the files fut0fut.cc and ut0byte.cc which only mattered when
UNIV_NONINL was defined.
2017-03-17 12:42:07 +02:00
Marko Mäkelä
7668a79a88 MDEV-12269 Port Bug#22996442 INNODB: MAKE UNIV_DEBUG DEPEND ON DBUG_OFF
This is a partial port of my patch in MySQL 8.0.
In MySQL 8.0, all InnoDB references to DBUG_OFF were replaced
with UNIV_DEBUG. We will not do that in MariaDB.

InnoDB used two independent compile-time flags that distinguish
debug and non-debug builds, which is confusing.

Also, make ut_ad() an alias of DBUG_ASSERT().
2017-03-16 10:24:53 +02:00
Jan Lindström
50eb40a2a8 MDEV-11738: Mariadb uses 100% of several of my 8 cpus doing nothing
MDEV-11581: Mariadb starts InnoDB encryption threads
when key has not changed or data scrubbing turned off

Background: Key rotation is based on background threads
(innodb-encryption-threads) periodically going through
all tablespaces in fil_system. For each tablespace, the
key version currently in use is compared to the maximum key age
(innodb-encryption-rotate-key-age). This process
naturally takes CPU. At the same time, the need for
scrubbing is checked. Currently, key rotation
is fully supported only by the Amazon AWS key management
plugin, but InnoDB does not know which key
management plugin is used.

This patch re-purposes innodb-encryption-rotate-key-age=0
to disable key rotation and background data scrubbing.
All new tables are added to a special list for key rotation,
and key rotation is based on sending an event to the
background encryption threads instead of using periodic
checking (i.e. timeout).

fil0fil.cc: Added the function fil_space_acquire_low()
to acquire a tablespace when it could be dropped concurrently.
This function is used by fil_space_acquire() and
fil_space_acquire_silent(); the latter will not print
any messages if we try to acquire a space that does not exist.
Added fil_space_release() to release an acquired tablespace,
and fil_space_next() to iterate over tablespaces in fil_system
using fil_space_acquire() and fil_space_release().
Similarly, added fil_space_keyrotation_next() to iterate over the new
list fil_system->rotation_list, to which new tables
are added if key rotation is disabled.
Removed the unnecessary functions fil_get_first_space_safe()
and fil_get_next_space_safe().

fil_node_open_file(): After page 0 is read, also read
crypt_info if it has not been read yet.

btr_scrub_lock_dict_func()
buf_page_check_corrupt()
buf_page_encrypt_before_write()
buf_merge_or_delete_for_page()
lock_print_info_all_transactions()
row_fts_psort_info_init()
row_truncate_table_for_mysql()
row_drop_table_for_mysql()
    Use fil_space_acquire()/release() to access fil_space_t.

buf_page_decrypt_after_read():
    Use fil_space_get_crypt_data() because at this point
    we might not yet have read page 0.

fil0crypt.cc/fil0fil.h: Many changes. Pass fil_space_t* directly
to functions needing it and store fil_space_t* to rotation state.
Use fil_space_acquire()/release() when iterating tablespaces
and removed unnecessary is_closing from fil_crypt_t. Use
fil_space_t::is_stopping() to detect when access to
tablespace should be stopped. Removed unnecessary
fil_space_get_crypt_data().

fil_space_create(): Inform key rotation that there could
be something to do if key rotation is disabled and a new
table with encryption enabled is created.
Remove unnecessary functions fil_get_first_space_safe()
and fil_get_next_space_safe(). fil_space_acquire()
and fil_space_release() are used instead. Moved
fil_space_get_crypt_data() and fil_space_set_crypt_data()
to fil0crypt.cc.

fsp_header_init(): Acquire fil_space_t*, write crypt_data
and release space.

check_table_options()
	Renamed FIL_SPACE_ENCRYPTION_* to FIL_ENCRYPTION_*

i_s.cc: Added ROTATING_OR_FLUSHING field to
information_schema.innodb_tablespace_encryption
to show current status of key rotation.
2017-03-14 16:23:10 +02:00
Marko Mäkelä
9dc10d5851 Merge 10.0 into 10.1 2017-03-13 19:17:34 +02:00
Marko Mäkelä
0094b6581d Merge 10.0 into 10.1 2017-03-10 15:16:13 +02:00
Marko Mäkelä
1b2b209519 Use correct integer format with printf-like functions. 2017-03-09 11:28:07 +02:00
Marko Mäkelä
498f4a825b Fix InnoDB/XtraDB compilation warnings on 32-bit builds. 2017-03-09 08:54:07 +02:00
Marko Mäkelä
89d80c1b0b Fix many -Wconversion warnings.
Define my_thread_id as an unsigned type, to avoid mismatch with
ulonglong.  Change some parameters to this type.

Use size_t in a few more places.

Declare many flag constants as unsigned to avoid sign mismatch
when shifting bits or applying the unary ~ operator.

When applying the unary ~ operator to enum constants, explicitly
cast the result to an unsigned type, because enum constants can
be treated as signed.

In InnoDB, change the source code line number parameters from
ulint to unsigned type. Also, make some InnoDB functions return
a narrower type (unsigned or uint32_t instead of ulint;
bool instead of ibool).
2017-03-07 19:07:27 +02:00
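The enum point made in 89d80c1b0b can be illustrated with a small sketch. The enum `page_flag` and the helper `clear_flag()` are invented for this example. Applying ~ to an unscoped enum constant promotes it to an integer type that may be signed, so the result is cast to unsigned before it is combined with an unsigned flags word.

```cpp
// Hypothetical flag constants for illustration.
enum page_flag { FLAG_DIRTY = 1, FLAG_PINNED = 2, FLAG_HASHED = 4 };

// Clear one flag bit. The explicit cast makes the signed-to-unsigned
// conversion of ~flag visible instead of leaving it implicit.
inline unsigned clear_flag(unsigned flags, page_flag flag)
{
    return flags & static_cast<unsigned>(~flag);
}
```

Without the cast, an expression such as `flags & ~flag` mixes signed and unsigned operands, which is the kind of code that -Wconversion style warnings report.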
Marko Mäkelä
27b9989d31 MDEV-12121 Introduce build option WITH_INNODB_AHI to disable innodb_adaptive_hash_index
The InnoDB adaptive hash index is sometimes degrading the performance of
InnoDB, and it is sometimes disabled to get more consistent performance.
We should have a compile-time option to disable the adaptive hash index.

Let us introduce two options:

OPTION(WITH_INNODB_AHI "Include innodb_adaptive_hash_index" ON)
OPTION(WITH_INNODB_ROOT_GUESS "Cache index root block descriptors" ON)

where WITH_INNODB_AHI always implies WITH_INNODB_ROOT_GUESS.

As part of this change, the misleadingly named function
trx_search_latch_release_if_reserved(trx) will be replaced with the macro
trx_assert_no_search_latch(trx) that will be empty unless
BTR_CUR_HASH_ADAPT is defined (cmake -DWITH_INNODB_AHI=ON).

We will also remove the unused column
INFORMATION_SCHEMA.INNODB_TRX.TRX_ADAPTIVE_HASH_TIMEOUT.
In MariaDB Server 10.1, it used to reflect the value of
trx_t::search_latch_timeout which could be adjusted during
row_search_for_mysql(). In 10.2, there is no such field.

Other than the removal of the unused column TRX_ADAPTIVE_HASH_TIMEOUT,
this is an almost non-functional change to the server when using the
default build options.

Some tests are adjusted so that they will work with both
-DWITH_INNODB_AHI=ON and -DWITH_INNODB_AHI=OFF. The test
innodb.innodb_monitor has been renamed to innodb.monitor
in order to track MySQL 5.7, and the duplicate tests
sys_vars.innodb_monitor_* are removed.
2017-03-03 16:55:50 +02:00
Marko Mäkelä
78153cf641 MDEV-11927 InnoDB change buffer is not being merged
to tables in the system tablespace

This is a regression caused by MDEV-11585, which accidentally
changed Tablespace::is_undo_tablespace() in an incorrect way,
causing the InnoDB system tablespace to be reported as a dedicated
undo tablespace, for which the change buffer is not applicable.

Tablespace::is_undo_tablespace(): Remove. There were only 2
calls from the function buf_page_io_complete(). Replace those
calls as appropriate.

Also, merge changes to tablespace import/export tests from
MySQL 5.7, and clean up the tests a little further, allowing
them to be run with any innodb_page_size.

Remove duplicated error injection instrumentation for the
import/export tests.  In MySQL 5.7, the error injection label
buf_page_is_corrupt_failure was renamed to
buf_page_import_corrupt_failure.

fil_space_extend_must_retry(): Correct a debug assertion
(tablespaces can be extended during IMPORT), and remove a
TODO comment about compressed temporary tables that was
already addressed in MDEV-11816.

dict_build_tablespace_for_table(): Correct a comment that
no longer holds after MDEV-11816, and assert that
ROW_FORMAT=COMPRESSED can only be used in .ibd files.
2017-02-24 22:16:33 +02:00
Marko Mäkelä
51af19851a MDEV-11454 post-merge fix:
buf_dump(): Correct the printf format passed to buf_dump_status()
to match the argument types.

Revert the changes to storage/xtradb. XtraDB is not being compiled
for 10.2. The unused copy that we have in the 10.2 branch is only
getting merges from 10.1.

Disable the test sys_vars.innodb_buffer_pool_dump_pct_function
because it is unstable on buildbot.
2017-02-24 22:12:01 +02:00
Marko Mäkelä
342b48b7b1 Merge pull request #264 from grooverdan/10.2-MDEV-11454-innodb_buffer_pool_dump_pct-entire-pool
MDEV-11454: Make innodb_buffer_pool_dump_pct refer to the entire buffer pool size
2017-02-24 15:12:09 +02:00
Marko Mäkelä
3c47ed4849 Merge 10.0 into 10.1 2017-02-20 14:02:40 +02:00
Marko Mäkelä
a13a636c74 MDEV-11802 innodb.innodb_bug14676111 fails
The function trx_purge_stop() was calling os_event_reset(purge_sys->event)
before calling rw_lock_x_lock(&purge_sys->latch). The os_event_set()
call in srv_purge_coordinator_suspend() is protected by that X-latch.

It would seem a good idea to consistently protect both os_event_set()
and os_event_reset() calls with a common mutex or rw-lock in those
cases where os_event_set() and os_event_reset() are used
like condition variables, tied to changes of shared state.

For each os_event_t, we try to document the mutex or rw-lock that is
being used. For some events, frequent calls to os_event_set() seem to
try to avoid hangs. Some events are never waited for infinitely, only
timed waits, and os_event_set() is used for early termination of these
waits.

os_aio_simulated_put_read_threads_to_sleep(): Define as a null macro
on other systems than Windows. TODO: remove this altogether and disable
innodb_use_native_aio on Windows.

os_aio_segment_wait_events[]: Initialize only if innodb_use_native_aio=0.

log_write_flush_to_disk_low(): Invoke log_mutex_enter() at the end, to
avoid race conditions when changing the system state. (No potential
race condition existed before MySQL 5.7.)
2017-02-20 12:32:43 +02:00
Marko Mäkelä
13493078e9 MDEV-11802 innodb.innodb_bug14676111 fails
The function trx_purge_stop() was calling os_event_reset(purge_sys->event)
before calling rw_lock_x_lock(&purge_sys->latch). The os_event_set()
call in srv_purge_coordinator_suspend() is protected by that X-latch.

It would seem a good idea to consistently protect both os_event_set()
and os_event_reset() calls with a common mutex or rw-lock in those
cases where os_event_set() and os_event_reset() are used
like condition variables, tied to changes of shared state.

For each os_event_t, we try to document the mutex or rw-lock that is
being used. For some events, frequent calls to os_event_set() seem to
try to avoid hangs. Some events are never waited for infinitely, only
timed waits, and os_event_set() is used for early termination of these
waits.

os_aio_simulated_put_read_threads_to_sleep(): Define as a null macro
on other systems than Windows. TODO: remove this altogether and disable
innodb_use_native_aio on Windows.

os_aio_segment_wait_events[]: Initialize only if innodb_use_native_aio=0.
2017-02-20 12:20:52 +02:00
Sergei Golubchik
2195bb4e41 Merge branch '10.1' into 10.2 2017-02-10 17:01:45 +01:00
Jan Lindström
de9963b786 After review fixes. 2017-02-10 17:41:35 +02:00
Jan Lindström
41cd80fe06 After review fixes. 2017-02-10 16:05:37 +02:00
Jan Lindström
0340067608 After review fixes for MDEV-11759.
buf_page_is_checksum_valid_crc32()
buf_page_is_checksum_valid_innodb()
buf_page_is_checksum_valid_none():
	Use ULINTPF instead of %lu and %u for ib_uint32_t

fil_space_verify_crypt_checksum():
	Check that page is really empty if checksum and
	LSN are zero.

fil_space_verify_crypt_checksum():
	Correct the comment to be more accurate.

buf0buf.h:
	Remove unnecessary is_corrupt variable from
	buf_page_t structure.
2017-02-09 08:49:13 +02:00
Jan Lindström
e53dfb24be MDEV-11707: Fix incorrect memset() for structures containing
dynamic class GenericPolicy<TTASEventMutex<GenericPolicy> >'; vtable

Instead of using mem_heap_alloc and memset, use mem_heap_zalloc
directly.
2017-02-06 15:40:17 +02:00
Jan Lindström
ddf2fac733 MDEV-11759: Encryption code in MariaDB 10.1/10.2 causes
compatibility problems

Pages that are encrypted contain a post-encryption checksum in a
different location than the normal checksum fields. Therefore,
before decryption we should check this checksum to avoid
decrypting corrupted pages. After decryption we can use the
traditional checksum check to detect whether the page is corrupted
or decryption was done using an incorrect key.

Pages that are page compressed do not contain any checksum;
here we need to first decrypt, then decompress, and finally
use the traditional checksum check to detect page corruption
or that we used an incorrect key in decryption.

buf0buf.cc: buf_page_is_corrupted() modified so that
compressed pages are skipped.

buf0buf.h, buf_block_init(), buf_page_init_low():
removed unnecessary page_encrypted, page_compressed,
stored_checksum, calculated_checksum fields from
buf_page_t

buf_page_get_gen(): use new buf_page_check_corrupt() function
to detect corrupted pages.

buf_page_check_corrupt(): If the page was not yet decrypted,
check whether the post-encryption checksum still matches.
If the page is no longer encrypted, use the buf_page_is_corrupted()
traditional checksum method.

If the page is detected as corrupted and it is not encrypted,
we print a corruption message to the error log.
If the page is still encrypted, or it was encrypted and is now
corrupted, we print a message to the error log that the page is
encrypted.

buf_page_io_complete(): use new buf_page_check_corrupt()
function to detect corrupted pages.

buf_page_decrypt_after_read(): Verify the post-encryption
checksum before trying to decrypt.

fil0crypt.cc: fil_encrypt_buf() verify post encryption
checksum and in fil_space_decrypt() return true
if we really decrypted the page.

fil_space_verify_crypt_checksum(): rewrite to use
the method used when calculating the post-encryption
checksum. If the post-encryption checksum matches, we
also check that the traditional checksum check does not
match.

fil0fil.ic: Add missed page type encrypted and page
compressed to fil_get_page_type_name()

Note that this change does not yet fix the innochecksum tool;
that will be done in a separate MDEV.

Fix test failures caused by buf page corruption injection.
2017-02-06 15:40:16 +02:00
Marko Mäkelä
81b7fe9d38 Shut down InnoDB after aborted startup.
This fixes memory leaks in tests that cause InnoDB startup to fail.

buf_pool_free_instance(): Also free buf_pool->flush_rbt, which would
normally be freed when crash recovery finishes.

fil_node_close_file(), fil_space_free_low(), fil_close_all_files():
Relax some debug assertions to tolerate !srv_was_started.

innodb_shutdown(): Renamed from innobase_shutdown_for_mysql().
Changed the return type to void. Do not assume that all subsystems
were started.

que_init(), que_close(): Remove (empty functions).

srv_init(), srv_general_init(): Remove as global functions.

srv_free(): Allow srv_sys=NULL.

srv_get_active_thread_type(): Only return SRV_PURGE if purge really
is running.

srv_shutdown_all_bg_threads(): Do not reset srv_start_state. It will
be needed by innodb_shutdown().

innobase_start_or_create_for_mysql(): Always call srv_boot() so that
innodb_shutdown() can assume that it was called. Make more subsystems
dependent on SRV_START_STATE_STAT.

srv_shutdown_bg_undo_sources(): Require SRV_START_STATE_STAT.

trx_sys_close(): Do not assume purge_sys!=NULL. Do not call
buf_dblwr_free(), because the doublewrite buffer can exist while
the transaction system does not.

logs_empty_and_mark_files_at_shutdown(): Do a faster shutdown if
!srv_was_started.

recv_sys_close(): Invoke dblwr.pages.clear() which would normally
be invoked by buf_dblwr_process().

recv_recovery_from_checkpoint_start(): Always release log_sys->mutex.

row_mysql_close(): Allow the subsystem not to exist.
2017-02-01 09:30:55 +02:00
Jan Lindström
6495806e59 MDEV-11254: innodb-use-trim has no effect in 10.2
Problem was that implementation merged from 10.1 was incompatible
with InnoDB 5.7.

buf0buf.cc: Add functions that return whether we should punch a hole
and how big it should be.

buf0flu.cc: Add written page to IORequest

fil0fil.cc: Remove an unneeded status call and add a test for whether
sparse files and hole punching are supported by the file system when
a tablespace is created. Add a call to get the file system
block size. The file node used is added to IORequest. Added
functions to check whether hole punching is supported and to set
punch hole.

ha_innodb.cc: Remove unneeded status variables (trim512-32768)
and trim_op_saved. Deprecate innodb_use_trim and
set it ON by default. Add function to set innodb-use-trim
dynamically.

dberr.h: Add error code DB_IO_NO_PUNCH_HOLE
if punch hole operation fails.

fil0fil.h: Add punch_hole variable to fil_space_t and
block size to fil_node_t.

os0api.h: Header for helper functions in buf0buf.cc and
fil0fil.cc for os0file.h

os0file.h: Remove unneeded m_block_size from IORequest
and add bpage to IORequest to know the actual size of
the block, and m_fil_node to know the tablespace file
system block size and whether it supports punch hole.

os0file.cc: Add function punch_hole() to IORequest
to do the punch_hole operation,
get the file system block size and determine
whether the file system supports sparse files (for punch hole).

page0size.h: Remove the disabling of implicit copying and
use the implicit copy to implement the copy_from()
function.

buf0dblwr.cc, buf0flu.cc, buf0rea.cc, fil0fil.cc, fil0fil.h,
os0file.h, os0file.cc, log0log.cc, log0recv.cc:
Remove unneeded write_size parameter from fil_io
calls.

srv0mon.h, srv0srv.h, srv0mon.cc: Remove unneeded
trim512-trim32768 status variables. Removed
these from monitor tests.
2017-01-24 14:40:58 +02:00
Jan Lindström
b7b4c332c0 MDEV-11614: Syslog messages: "InnoDB: Log sequence number
at the start 759654123 and the end 0 do not match."

For page compressed and encrypted tables log sequence
number at end is not stored, thus disable this message
for them.
2017-01-22 08:46:15 +02:00
Marko Mäkelä
b05bf8ff0f Merge 10.1 to 10.2.
Most notably, this includes MDEV-11623, which includes a fix and
an upgrade procedure for the InnoDB file format incompatibility
that is present in MariaDB Server 10.1.0 through 10.1.20.

In other words, this merge should address
MDEV-11202 InnoDB 10.1 -> 10.2 migration does not work
2017-01-19 12:06:13 +02:00
Jan Lindström
dc557ca817 MDEV-11835: InnoDB: Failing assertion: free_slot != NULL on
restarting server with encryption and read-only

buf0buf.cc: The number of temporary slots used in encryption was
calculated as read_threads * write_threads. However, in read-only mode
write_threads is zero. The correct way is to calculate
(read_threads + write_threads) * max pending IO requests.
2017-01-19 08:19:08 +02:00
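The arithmetic behind dc557ca817 can be checked with a short sketch. The function names `tmp_slots_old()` and `tmp_slots_new()` are invented for this example, and the value used for the maximum number of pending I/O requests below is an arbitrary assumption, not a number taken from InnoDB.

```cpp
// Old calculation per the commit message: a product of the thread counts,
// which collapses to zero when write_threads is zero.
inline unsigned long tmp_slots_old(unsigned long read_threads,
                                   unsigned long write_threads)
{
    return read_threads * write_threads;
}

// Corrected calculation: the sum of the thread counts times the maximum
// number of pending I/O requests (passed in here as a plain parameter).
inline unsigned long tmp_slots_new(unsigned long read_threads,
                                   unsigned long write_threads,
                                   unsigned long max_pending_io)
{
    return (read_threads + write_threads) * max_pending_io;
}
```

With 4 read threads and 0 write threads, the old formula yields 0 slots, so no free slot can ever be found; the new formula with an assumed limit of 256 pending requests yields 1024.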
Marko Mäkelä
1eabad5dbe Remove MYSQL_COMPRESSION.
The MariaDB 10.1 page_compression is incompatible with the Oracle
implementation that was introduced in MySQL 5.7 later.

Remove the Oracle implementation. Also remove the remaining traces of
MYSQL_ENCRYPTION.

This will also remove traces of PUNCH_HOLE until it is implemented
better. The only effective call to os_file_punch_hole() was in
fil_node_create_low() to test if the operation is supported for the file.

In other words, it looks like page_compression is not working in
MariaDB 10.2, because no code equivalent to the 10.1 os_file_trim()
is enabled.
2017-01-18 08:30:42 +02:00
Marko Mäkelä
ab1e6fefd8 MDEV-11623 MariaDB 10.1 fails to start datadir created with
MariaDB 10.0/MySQL 5.6 using innodb-page-size!=16K

The storage format of FSP_SPACE_FLAGS was accidentally broken
already in MariaDB 10.1.0. This fix is bringing the format in
line with other MySQL and MariaDB release series.

Please refer to the comments that were added to fsp0fsp.h
for details.

This is an INCOMPATIBLE CHANGE that affects users of
page_compression and non-default innodb_page_size. Upgrading
to this release will correct the flags in the data files.
If you want to downgrade to earlier MariaDB 10.1.x, please refer
to the test innodb.101_compatibility for how to reset the
FSP_SPACE_FLAGS in the files.

NOTE: MariaDB 10.1.0 to 10.1.20 can misinterpret
uncompressed data files with innodb_page_size=4k or 64k as
compressed innodb_page_size=16k files, and then probably fail
when trying to access the pages. See the comments in the
function fsp_flags_convert_from_101() for detailed analysis.

Move PAGE_COMPRESSION to FSP_SPACE_FLAGS bit position 16.
In this way, compressed innodb_page_size=16k tablespaces will not
be mistaken for uncompressed ones by MariaDB 10.1.0 to 10.1.20.

Derive PAGE_COMPRESSION_LEVEL, ATOMIC_WRITES and DATA_DIR from the
dict_table_t::flags when the table is available, in
fil_space_for_table_exists_in_mem() or fil_open_single_table_tablespace().
During crash recovery, fil_load_single_table_tablespace() will use
innodb_compression_level for the PAGE_COMPRESSION_LEVEL.

FSP_FLAGS_MEM_MASK: A bitmap of the memory-only fil_space_t::flags
that are not to be written to FSP_SPACE_FLAGS. Currently, these will
include PAGE_COMPRESSION_LEVEL, ATOMIC_WRITES and DATA_DIR.

Introduce the macro FSP_FLAGS_PAGE_SSIZE(). We only support
one innodb_page_size for the whole instance.

When creating a dummy tablespace for the redo log, use
fil_space_t::flags=0. The flags are never written to the redo log files.

Remove many FSP_FLAGS_SET_ macros.

dict_tf_verify_flags(): Remove. This is basically only duplicating
the logic of dict_tf_to_fsp_flags(), used in a debug assertion.

fil_space_t::mark: Remove. This flag was not used for anything.

fil_space_for_table_exists_in_mem(): Remove the unnecessary parameter
mark_space, and add a parameter for table flags. Check that
fil_space_t::flags match the table flags, and adjust the (memory-only)
flags based on the table flags.

fil_node_open_file(): Remove some redundant or unreachable conditions,
do not use stderr for output, and avoid unnecessary server aborts.

fil_user_tablespace_restore_page(): Convert the flags, so that the
correct page_size will be used when restoring a page from the
doublewrite buffer.

fil_space_get_page_compressed(), fsp_flags_is_page_compressed(): Remove.
It suffices to have fil_space_is_page_compressed().

FSP_FLAGS_WIDTH_DATA_DIR, FSP_FLAGS_WIDTH_PAGE_COMPRESSION_LEVEL,
FSP_FLAGS_WIDTH_ATOMIC_WRITES: Remove, because these flags do not
exist in the FSP_SPACE_FLAGS but only in memory.

fsp_flags_try_adjust(): New function, to adjust the FSP_SPACE_FLAGS
in page 0. Called by fil_open_single_table_tablespace(),
fil_space_for_table_exists_in_mem(), innobase_start_or_create_for_mysql()
except if --innodb-read-only is active.

fsp_flags_is_valid(ulint): Reimplement from scratch, with
accurate comments. Do not display any details of detected
inconsistencies, because the output could be confusing when
dealing with MariaDB 10.1.x data files.

fsp_flags_convert_from_101(ulint): Convert flags from buggy
MariaDB 10.1.x format, or return ULINT_UNDEFINED if the flags
cannot be in MariaDB 10.1.x format.

fsp_flags_match(): Check the flags when probing files.
Implemented based on fsp_flags_is_valid()
and fsp_flags_convert_from_101().

dict_check_tablespaces_and_store_max_id(): Do not access the
page after committing the mini-transaction.

IMPORT TABLESPACE fixes:

AbstractCallback::init(): Convert the flags.

FetchIndexRootPages::operator(): Check that the tablespace flags match the
table flags. Do not attempt to convert tablespace flags to table flags,
because the conversion would necessarily be lossy.

PageConverter::update_header(): Write back the correct flags.
This takes care of the flags in IMPORT TABLESPACE.
2017-01-15 19:05:50 +02:00
Marko Mäkelä
a9d00db155 MDEV-11799 InnoDB can abort if the doublewrite buffer
contains a bad and a good copy

Clean up the InnoDB doublewrite buffer code.

buf_dblwr_init_or_load_pages(): Do not add empty pages to the buffer.

buf_dblwr_process(): Do consider changes to pages that are all zero.
Do not abort when finding a corrupted copy of a page in the doublewrite
buffer, because there could be multiple copies in the doublewrite buffer,
and only one of them needs to be good.
2017-01-15 18:56:56 +02:00
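The recovery rule stated in a9d00db155 (only one copy of a page in the doublewrite buffer needs to be good) can be sketched as a simple search. The struct `page_copy` and the function `find_good_copy()` are hypothetical and much simpler than buf_dblwr_process(); the checksum result is modeled as a precomputed flag.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical record for one page image found in the doublewrite buffer.
struct page_copy {
    unsigned page_no;      // which page this image belongs to
    bool     checksum_ok;  // whether the image passed its checksum
};

// Return the position of the first good copy of page_no, or -1 if none.
// A corrupted copy is skipped rather than treated as a fatal error.
inline int find_good_copy(const std::vector<page_copy>& copies,
                          unsigned page_no)
{
    for (std::size_t i = 0; i < copies.size(); i++) {
        if (copies[i].page_no == page_no && copies[i].checksum_ok) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

Under this rule, a bad copy followed by a good copy of the same page still allows the page to be restored.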
Marko Mäkelä
1ba7234b21 Follow-up to MDEV-11713: Make more use of DBUG_LOG 2017-01-12 13:47:18 +02:00
Sergei Golubchik
ed008a74cf Make atomic writes general
- Atomic writes are enabled by default
- Automatically detect if device supports atomic write and use it if
  atomic writes are enabled
- Remove ATOMIC WRITE options from CREATE TABLE
  - Atomic write is a device option, not a table option, as the table may
    crash if the media changes
- Add support for SHANNON SSD cards
2017-01-11 09:18:35 +02:00
Marko Mäkelä
fb5ee7d6d0 Plug a memory leak in buf_dblwr_process(). 2017-01-05 19:01:14 +02:00
Marko Mäkelä
a8ac6dc506 Fix InnoDB compilation warnings.
Most of them are trivial, except for the thread_sync_t refactoring.
We must not invoke memset() on non-POD objects.

mtflush_work_initialized: Remove. Refer to mtflush_ctx != NULL instead.

thread_sync_t::thread_sync_t(): Refactored from
buf_mtflu_handler_init().

thread_sync_t::~thread_sync_t(): Refactored from
buf_mtflu_io_thread_exit().
2017-01-05 11:49:00 +02:00