Commit graph

112 commits

Author SHA1 Message Date
Eugene Kosov
22f2b39c14 fix data races in rwlock 2017-12-19 13:03:04 +04:00
Marko Mäkelä
7cb3520c06 Merge bb-10.2-ext into 10.3 2017-11-30 08:16:37 +02:00
Marko Mäkelä
c19ef508b8 InnoDB: Remove ut_snprintf() and the use of my_snprintf(); use snprintf() 2017-11-13 02:11:48 +02:00
Marko Mäkelä
a9faad5f78 Remove an unused variable 2017-11-02 12:07:44 +02:00
Alexander Barkov
835cbbcc7b Merge remote-tracking branch 'origin/bb-10.2-ext' into 10.3
TODO: enable MDEV-13049 optimization for 10.3
2017-10-30 20:47:39 +04:00
Marko Mäkelä
24bffb080b MDEV-13918 Race condition between INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS and ALTER/DROP/TRUNCATE TABLE
MDEV-14021 Failing assertion: table->get_ref_count() == 0 on ALTER TABLE with implicit ALGORITHM=INPLACE

Post-merge fix: Remove the now-useless use of table reference count.
The dict_operation_lock S-latch whose use was introduced in the 10.0
fix will prevent the table from disappearing due to concurrent
DDL operations (DROP, table-rebuilding ALTER, TRUNCATE).
2017-10-24 23:19:54 +03:00
Sergei Golubchik
e0a1c745ec Merge branch '10.1' into 10.2 2017-10-24 14:53:18 +02:00
Sergei Golubchik
9d2e2d7533 Merge branch '10.0' into 10.1 2017-10-22 13:03:41 +02:00
Marko Mäkelä
2eb3c5e542 MDEV-13918 Race condition between INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS and ALTER/DROP/TRUNCATE TABLE
i_s_sys_tables_fill_table_stats(): Acquire dict_operation_lock
S-latch before acquiring dict_sys->mutex, to prevent the table
from being removed from the data dictionary cache and from
being freed while i_s_dict_fill_sys_tablestats() is accessing
the table handle.
2017-10-18 21:19:33 +03:00
Marko Mäkelä
d11af09865 MDEV-14076 InnoDB: Failing assertion when accessing INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES upon upgrade from 10.1.0 to 10.1.20
i_s_dict_fill_sys_tablespaces(): Adjust the tablespace flags if needed.
2017-10-16 20:54:07 +03:00
Marko Mäkelä
a4948dafcd MDEV-11369 Instant ADD COLUMN for InnoDB
For InnoDB tables, adding, dropping and reordering columns has
required a rebuild of the table and all its indexes. Since MySQL 5.6
(and MariaDB 10.0) this has been supported online (LOCK=NONE), allowing
concurrent modification of the tables.

This work revises the InnoDB ROW_FORMAT=REDUNDANT, ROW_FORMAT=COMPACT
and ROW_FORMAT=DYNAMIC so that columns can be appended instantaneously,
with only minor changes performed to the table structure. The counter
innodb_instant_alter_column in INFORMATION_SCHEMA.GLOBAL_STATUS
is incremented whenever a table rebuild operation is converted into
an instant ADD COLUMN operation.

ROW_FORMAT=COMPRESSED tables will not support instant ADD COLUMN.

Some usability limitations will be addressed in subsequent work:

MDEV-13134 Introduce ALTER TABLE attributes ALGORITHM=NOCOPY
and ALGORITHM=INSTANT
MDEV-14016 Allow instant ADD COLUMN, ADD INDEX, LOCK=NONE

The format of the clustered index (PRIMARY KEY) is changed as follows:

(1) The FIL_PAGE_TYPE of the root page will be FIL_PAGE_TYPE_INSTANT,
and a new field PAGE_INSTANT will contain the original number of fields
in the clustered index ('core' fields).
If instant ADD COLUMN has not been used or the table becomes empty,
or the very first instant ADD COLUMN operation is rolled back,
the fields PAGE_INSTANT and FIL_PAGE_TYPE will be reset
to 0 and FIL_PAGE_INDEX.

(2) A special 'default row' record is inserted into the leftmost leaf,
between the page infimum and the first user record. This record is
distinguished by the REC_INFO_MIN_REC_FLAG, and it is otherwise in the
same format as records that contain values for the instantly added
columns. This 'default row' always has the same number of fields as
the clustered index according to the table definition. The values of
'core' fields are to be ignored. For other fields, the 'default row'
will contain the default values as they were during the ALTER TABLE
statement. (If the column default values are changed later, those
values will only be stored in the .frm file. The 'default row' will
contain the original evaluated values, which must be the same for
every row.) The 'default row' must be completely hidden from
higher-level access routines. Assertions have been added to ensure
that no 'default row' is ever present in the adaptive hash index
or in locked records. The 'default row' is never delete-marked.

(3) In clustered index leaf page records, the number of fields must
reside between the number of 'core' fields (dict_index_t::n_core_fields
introduced in this work) and dict_index_t::n_fields. If the number
of fields is less than dict_index_t::n_fields, the missing fields
are replaced with the column value of the 'default row'.
Note: The number of fields in the record may shrink if some of the
last instantly added columns are updated to the value that is
in the 'default row'. The function btr_cur_trim() implements this
'compression' on update and rollback; dtuple::trim() implements it
on insert.

(4) In ROW_FORMAT=COMPACT and ROW_FORMAT=DYNAMIC records, the new
status value REC_STATUS_COLUMNS_ADDED will indicate the presence of
a new record header that will encode n_fields-n_core_fields-1 in
1 or 2 bytes. (In ROW_FORMAT=REDUNDANT records, the record header
always explicitly encodes the number of fields.)

We introduce the undo log record type TRX_UNDO_INSERT_DEFAULT for
covering the insert of the 'default row' record when instant ADD COLUMN
is used for the first time. Subsequent instant ADD COLUMN can use
TRX_UNDO_UPD_EXIST_REC.

This is joint work with Vin Chen (陈福荣) from Tencent. The design
that was discussed in April 2017 would not have allowed import or
export of data files, because instead of the 'default row' it would
have introduced a data dictionary table. The test
rpl.rpl_alter_instant is exactly as contributed in pull request #408.
The test innodb.instant_alter is based on a contributed test.

The redo log record format changes for ROW_FORMAT=DYNAMIC and
ROW_FORMAT=COMPACT are as contributed. (With this change present,
crash recovery from MariaDB 10.3.1 will fail in spectacular ways!)
Also the semantics of higher-level redo log records that modify the
PAGE_INSTANT field is changed. The redo log format version identifier
was already changed to LOG_HEADER_FORMAT_CURRENT=103 in MariaDB 10.3.1.

Everything else has been rewritten by me. Thanks to Elena Stepanova,
the code has been tested extensively.

When rolling back an instant ADD COLUMN operation, we must empty the
PAGE_FREE list after deleting or shortening the 'default row' record,
by calling either btr_page_empty() or btr_page_reorganize(). We must
know the size of each entry in the PAGE_FREE list. If rollback left a
freed copy of the 'default row' in the PAGE_FREE list, we would be
unable to determine its size (if it is in ROW_FORMAT=COMPACT or
ROW_FORMAT=DYNAMIC) because it would contain more fields than the
rolled-back definition of the clustered index.

UNIV_SQL_DEFAULT: A new special constant that designates an instantly
added column that is not present in the clustered index record.

len_is_stored(): Check if a length is an actual length. There are
two magic length values: UNIV_SQL_DEFAULT, UNIV_SQL_NULL.

dict_col_t::def_val: The 'default row' value of the column.  If the
column is not added instantly, def_val.len will be UNIV_SQL_DEFAULT.

dict_col_t: Add the accessors is_virtual(), is_nullable(), is_instant(),
instant_value().

dict_col_t::remove_instant(): Remove the 'instant ADD' status of
a column.

dict_col_t::name(const dict_table_t& table): Replaces
dict_table_get_col_name().

dict_index_t::n_core_fields: The original number of fields.
For secondary indexes and if instant ADD COLUMN has not been used,
this will be equal to dict_index_t::n_fields.

dict_index_t::n_core_null_bytes: Number of bytes needed to
represent the null flags; usually equal to UT_BITS_IN_BYTES(n_nullable).

dict_index_t::NO_CORE_NULL_BYTES: Magic value signalling that
n_core_null_bytes was not initialized yet from the clustered index
root page.

dict_index_t: Add the accessors is_instant(), is_clust(),
get_n_nullable(), instant_field_value().

dict_index_t::instant_add_field(): Adjust clustered index metadata
for instant ADD COLUMN.

dict_index_t::remove_instant(): Remove the 'instant ADD' status
of a clustered index when the table becomes empty, or the very first
instant ADD COLUMN operation is rolled back.

dict_table_t: Add the accessors is_instant(), is_temporary(),
supports_instant().

dict_table_t::instant_add_column(): Adjust metadata for
instant ADD COLUMN.

dict_table_t::rollback_instant(): Adjust metadata on the rollback
of instant ADD COLUMN.

prepare_inplace_alter_table_dict(): First create the ctx->new_table,
and only then decide if the table really needs to be rebuilt.
We must split the creation of table or index metadata from the
creation of the dictionary table records and the creation of
the data. In this way, we can transform a table-rebuilding operation
into an instant ADD COLUMN operation. Dictionary objects will only
be added to cache when table rebuilding or index creation is needed.
The ctx->instant_table will never be added to cache.

dict_table_t::add_to_cache(): Modified and renamed from
dict_table_add_to_cache(). Do not modify the table metadata.
Let the callers invoke dict_table_add_system_columns() and if needed,
set can_be_evicted.

dict_create_sys_tables_tuple(), dict_create_table_step(): Omit the
system columns (which will now exist in the dict_table_t object
already at this point).

dict_create_table_step(): Expect the callers to invoke
dict_table_add_system_columns().

pars_create_table(): Before creating the table creation execution
graph, invoke dict_table_add_system_columns().

row_create_table_for_mysql(): Expect all callers to invoke
dict_table_add_system_columns().

create_index_dict(): Replaces row_merge_create_index_graph().

innodb_update_n_cols(): Renamed from innobase_update_n_virtual().
Call my_error() if an error occurs.

btr_cur_instant_init(), btr_cur_instant_init_low(),
btr_cur_instant_root_init():
Load additional metadata from the clustered index and set
dict_index_t::n_core_null_bytes. This is invoked
when table metadata is first loaded into the data dictionary.

dict_boot(): Initialize n_core_null_bytes for the four hard-coded
dictionary tables.

dict_create_index_step(): Initialize n_core_null_bytes. This is
executed as part of CREATE TABLE.

dict_index_build_internal_clust(): Initialize n_core_null_bytes to
NO_CORE_NULL_BYTES if table->supports_instant().

row_create_index_for_mysql(): Initialize n_core_null_bytes for
CREATE TEMPORARY TABLE.

commit_cache_norebuild(): Call the code to rename or enlarge columns
in the cache only if instant ADD COLUMN is not being used.
(Instant ADD COLUMN would copy all column metadata from
instant_table to old_table, including the names and lengths.)

PAGE_INSTANT: A new 13-bit field for storing dict_index_t::n_core_fields.
This is repurposing the 16-bit field PAGE_DIRECTION, of which only the
least significant 3 bits were used. The original byte containing
PAGE_DIRECTION will be accessible via the new constant PAGE_DIRECTION_B.

page_get_instant(), page_set_instant(): Accessors for the PAGE_INSTANT.

page_ptr_get_direction(), page_get_direction(),
page_ptr_set_direction(): Accessors for PAGE_DIRECTION.

page_direction_reset(): Reset PAGE_DIRECTION, PAGE_N_DIRECTION.

page_direction_increment(): Increment PAGE_N_DIRECTION
and set PAGE_DIRECTION.

rec_get_offsets(): Use the 'leaf' parameter for non-debug purposes,
and assume that heap_no is always set.
Initialize all dict_index_t::n_fields for ROW_FORMAT=REDUNDANT records,
even if the record contains fewer fields.

rec_offs_make_valid(): Add the parameter 'leaf'.

rec_copy_prefix_to_dtuple(): Assert that the tuple is only built
on the core fields. Instant ADD COLUMN only applies to the
clustered index, and we should never build a search key that has
more than the PRIMARY KEY and possibly DB_TRX_ID,DB_ROLL_PTR.
All these columns are always present.

dict_index_build_data_tuple(): Remove assertions that would be
duplicated in rec_copy_prefix_to_dtuple().

rec_init_offsets(): Support ROW_FORMAT=REDUNDANT records whose
number of fields is between n_core_fields and n_fields.

cmp_rec_rec_with_match(): Implement the comparison between two
MIN_REC_FLAG records.

trx_t::in_rollback: Make the field available in non-debug builds.

trx_start_for_ddl_low(): Remove dangerous error-tolerance.
A dictionary transaction must be flagged as such before it has generated
any undo log records. This is because trx_undo_assign_undo() will mark
the transaction as a dictionary transaction in the undo log header
right before the very first undo log record is being written.

btr_index_rec_validate(): Account for instant ADD COLUMN

row_undo_ins_remove_clust_rec(): On the rollback of an insert into
SYS_COLUMNS, revert instant ADD COLUMN in the cache by removing the
last column from the table and the clustered index.

row_search_on_row_ref(), row_undo_mod_parse_undo_rec(), row_undo_mod(),
trx_undo_update_rec_get_update(): Handle the 'default row'
as a special case.

dtuple_t::trim(index): Omit a redundant suffix of an index tuple right
before insert or update. After instant ADD COLUMN, if the last fields
of a clustered index tuple match the 'default row', there is no
need to store them. While trimming the entry, we must hold a page latch,
so that the table cannot be emptied and the 'default row' be deleted.

btr_cur_optimistic_update(), btr_cur_pessimistic_update(),
row_upd_clust_rec_by_insert(), row_ins_clust_index_entry_low():
Invoke dtuple_t::trim() if needed.

row_ins_clust_index_entry(): Restore dtuple_t::n_fields after calling
row_ins_clust_index_entry_low().

rec_get_converted_size(), rec_get_converted_size_comp(): Allow the number
of fields to be between n_core_fields and n_fields. Do not support
infimum,supremum. They are never supposed to be stored in dtuple_t,
because page creation nowadays uses a lower-level method for initializing
them.

rec_convert_dtuple_to_rec_comp(): Assign the status bits based on the
number of fields.

btr_cur_trim(): In an update, trim the index entry as needed. For the
'default row', handle rollback specially. For user records, omit
fields that match the 'default row'.

btr_cur_optimistic_delete_func(), btr_cur_pessimistic_delete():
Skip locking and adaptive hash index for the 'default row'.

row_log_table_apply_convert_mrec(): Replace 'default row' values if needed.
In the temporary file that is applied by row_log_table_apply(),
we must identify whether the records contain the extra header for
instantly added columns. For now, we will allocate an additional byte
for this for ROW_T_INSERT and ROW_T_UPDATE records when the source table
has been subject to instant ADD COLUMN. The ROW_T_DELETE records are
fine, as they will be converted and will only contain 'core' columns
(PRIMARY KEY and some system columns) that are converted from dtuple_t.

rec_get_converted_size_temp(), rec_init_offsets_temp(),
rec_convert_dtuple_to_temp(): Add the parameter 'status'.

REC_INFO_DEFAULT_ROW = REC_INFO_MIN_REC_FLAG | REC_STATUS_COLUMNS_ADDED:
An info_bits constant for distinguishing the 'default row' record.

rec_comp_status_t: An enum of the status bit values.

rec_leaf_format: An enum that replaces the bool parameter of
rec_init_offsets_comp_ordinary().
2017-10-06 09:50:10 +03:00
Marko Mäkelä
4e1fa7f63d Merge bb-10.2-ext into 10.3 2017-09-01 11:33:45 +03:00
Jan Lindström
eca238aea7 MDEV-13557: Startup failure, unable to decrypt ibdata1
Fixes also MDEV-13488: InnoDB writes CRYPT_INFO even though
encryption is not enabled.

Fixes also MDEV-13093: Leak of Datafile::m_crypt_info on
shutdown after failed startup.

Problem was that we created encryption metadata (crypt_data) for
system tablespace even when no encryption was enabled and too early.
System tablespace can be encrypted only using key rotation.

Test innodb-key-rotation-disable, innodb_encryption, innodb_lotoftables
require adjustment because INFORMATION_SCHEMA INNODB_TABLESPACES_ENCRYPTION
contain row only if tablespace really has encryption metadata.

xb_load_single_table_tablespace(): Do not call
fil_space_destroy_crypt_data() any more, because Datafile::m_crypt_data
has been removed.

fil_crypt_realloc_iops(): Avoid divide by zero.

fil_crypt_set_thread_cnt(): Set fil_crypt_threads_event if
encryption threads exist. This is required to find tablespaces
requiring key rotation if no other changes happen.

fil_crypt_find_space_to_rotate(): Decrease the amount of time waiting
when nothing happens to better enable key rotation on startup.

fil_ibd_open(), fil_ibd_load(): Load possible crypt_data from first
page.

class Datafile, class SysTablespace : remove m_crypt_info field.

Datafile::get_first_page(): Return a pointer to first page buffer.

fsp_header_init(): Write encryption metadata to page 0 only if
tablespace is encrypted or encryption is disabled by table option.

i_s_dict_fill_tablespaces_encryption(): Skip tablespaces that do not
contain encryption metadata. This is required to avoid too early
wait condition trigger in encrypted -> unencrypted state transfer.
2017-08-31 08:36:56 +03:00
Jan Lindström
352d27ce36 MDEV-13557: Startup failure, unable to decrypt ibdata1
Fixes also MDEV-13488: InnoDB writes CRYPT_INFO even though
encryption is not enabled.

Problem was that we created encryption metadata (crypt_data) for
system tablespace even when no encryption was enabled and too early.
System tablespace can be encrypted only using key rotation.

Test innodb-key-rotation-disable, innodb_encryption, innodb_lotoftables
require adjustment because INFORMATION_SCHEMA INNODB_TABLESPACES_ENCRYPTION
contain row only if tablespace really has encryption metadata.

fil_crypt_set_thread_cnt: Send message to background encryption threads
if they exits when they are ready. This is required to find tablespaces
requiring key rotation if no other changes happen.

fil_crypt_find_space_to_rotate: Decrease the amount of time waiting
when nothing happens to better enable key rotation on startup.

fsp_header_init: Write encryption metadata to page 0 only if tablespace is
encrypted or encryption is disabled by table option.

i_s_dict_fill_tablespaces_encryption : Skip tablespaces that do not
contain encryption metadata. This is required to avoid too early
wait condition trigger in encrypted -> unencrypted state transfer.

open_or_create_data_files: Do not create encryption metadata
by default to system tablespace.
2017-08-29 14:23:34 +03:00
Marko Mäkelä
99e017d099 Remove INFORMATION_SCHEMA.INNODB_TRX.trx_adaptive_hash_latched
In MariaDB 10.2, this column is practically always reported as 0,
even before the data field trx_t::has_search_latch was removed.
2017-06-20 09:39:13 +03:00
Marko Mäkelä
1e3886ae80 Merge bb-10.2-ext into 10.3 2017-06-19 17:28:08 +03:00
Marko Mäkelä
50faeda4d6 Remove trx_t::has_search_latch and simplify debug code
When the btr_search_latch was split into an array of latches
in MySQL 5.7.8 as part of the Oracle Bug#20985298 fix, the "caching"
of the latch across storage engine API calls was removed, and
the field trx->has_search_latch would only be set during a short
time frame in the execution of row_search_mvcc(), which was
formerly called row_search_for_mysql().

This means that the column
INFORMATION_SCHEMA.INNODB_TRX.TRX_ADAPTIVE_HASH_LATCHED will always
report 0. That column cannot be removed in MariaDB 10.2, but it
can be removed in future releases.

trx_t::has_search_latch: Remove.

trx_assert_no_search_latch(): Remove.

row_sel_try_search_shortcut_for_mysql(): Remove a redundant condition
on trx->has_search_latch (it was always true).

sync_check_iterate(): Make the parameter const.

sync_check_functor_t: Make the operator() const, and remove result()
and the virtual destructor. There is no need to have mutable state
in the functors.

sync_checker<bool>: Replaces dict_sync_check and btrsea_sync_check.

sync_check: Replaces btrsea_sync_check.

dict_sync_check: Instantiated from sync_checker.

sync_allowed_latches: Use std::find() directly on the array.
Remove the std::vector.

TrxInInnoDB::enter(), TrxInInnoDB::exit(): Remove obviously redundant
debug assertions on trx->in_depth, and use equality comparison against 0
because it could be more efficient on some architectures.
2017-06-16 13:17:05 +03:00
Marko Mäkelä
0c92794db3 Remove deprecated InnoDB file format parameters
The following options will be removed:

innodb_file_format
innodb_file_format_check
innodb_file_format_max
innodb_large_prefix

They have been deprecated in MySQL 5.7.7 (and MariaDB 10.2.2) in WL#7703.

The file_format column in two INFORMATION_SCHEMA tables will be removed:

innodb_sys_tablespaces
innodb_sys_tables

Code to update the file format tag at the end of page 0:5
(TRX_SYS_PAGE in the InnoDB system tablespace) will be removed.
When initializing a new database, the bytes will remain 0.

All references to the Barracuda file format will be removed.
Some references to the Antelope file format (meaning
ROW_FORMAT=REDUNDANT or ROW_FORMAT=COMPACT) will remain.

This basically ports WL#7704 from MySQL 8.0.0 to MariaDB 10.3.1:

commit 4a69dc2a95995501ed92d59a1de74414a38540c6
Author: Marko Mäkelä <marko.makela@oracle.com>
Date:   Wed Mar 11 22:19:49 2015 +0200
2017-06-02 09:36:14 +03:00
Marko Mäkelä
70505dd45b Merge 10.1 into 10.2 2017-05-22 09:46:51 +03:00
Marko Mäkelä
6f54d04eb9 Remove unnecessary conversions from integer to double 2017-05-19 22:58:59 +03:00
Marko Mäkelä
13a350ac29 Merge 10.0 into 10.1 2017-05-19 12:29:37 +03:00
Marko Mäkelä
217b8115c8 MDEV-12188 information schema - errors populating fail to free memory, unlock mutexes
Given the OK macro used in innodb does a DBUG_RETURN(1) on expression failure
the innodb implementation has a number of errors in i_s.cc.

We introduce a new macro BREAK_IF that replaces some use of the OK macro.
Also, do some other cleanup detailed below.

When invoking Field::store() on integers, always pass the parameter
is_unsigned=true to avoid an unnecessary conversion to double.

i_s_fts_deleted_generic_fill(), i_s_fts_config_fill():
Use the BREAK_IF macro instead of OK.

i_s_fts_index_cache_fill_one_index(), i_s_fts_index_table_fill_one_index():
Add a parameter for conv_string, and let the caller allocate that buffer.

i_s_fts_index_cache_fill(): Check the return status of
i_s_fts_index_cache_fill_one_index().

i_s_fts_index_table_fill(): Check the return status of
i_s_fts_index_table_fill_one_index().

i_s_fts_index_table_fill_one_fetch(): Always let the caller invoke
i_s_fts_index_table_free_one_fetch().

i_s_innodb_buffer_page_fill(), i_s_innodb_buf_page_lru_fill():
Do release dict_sys->mutex if filling the buffers fails.

i_s_innodb_buf_page_lru_fill(): Also display the value
INFORMATION_SCHEMA.INNODB_BUFFER_PAGE.PAGE_IO_FIX='IO_PIN'
when a block is in that state. Remove the unnecessary variable 'heap'.
2017-05-15 18:02:16 +03:00
Marko Mäkelä
021d636551 Fix some integer type mismatch.
Use uint32_t for the encryption key_id.

When filling unsigned integer values into INFORMATION_SCHEMA tables,
use the method Field::store(longlong, bool unsigned)
instead of using Field::store(double).

Fix also some miscellanous type mismatch related to ulint (size_t).
2017-05-10 12:45:46 +03:00
Marko Mäkelä
e63ead68bf Bug#24346574 PAGE CLEANER THREAD, ASSERT BLOCK->N_POINTERS == 0
btr_search_drop_page_hash_index(): Do not return before ensuring
that block->index=NULL, even if !btr_search_enabled. We would
typically still skip acquiring the AHI latch when the AHI is
disabled, because block->index would already be NULL. Only if the AHI
is in the process of being disabled, we would wait for the AHI latch
and then notice that block->index=NULL and return.

The above bug was a regression caused in MySQL 5.7.9 by the fix of
Bug#21407023: DISABLING AHI SHOULD AVOID TAKING AHI LATCH

The rest of this patch improves diagnostics by adding assertions.

assert_block_ahi_valid(): A debug predicate for checking that
block->n_pointers!=0 implies block->index!=NULL.

assert_block_ahi_empty(): A debug predicate for checking that
block->n_pointers==0.

buf_block_init(): Instead of assigning block->n_pointers=0,
assert_block_ahi_empty(block).

buf_pool_clear_hash_index(): Clarify comments, and assign
block->n_pointers=0 before assigning block->index=NULL.
The wrong ordering could make block->n_pointers appear incorrect in
debug assertions. This bug was introduced in MySQL 5.1.52 by
Bug#13006367 62487: INNODB TAKES 3 MINUTES TO CLEAN UP THE
ADAPTIVE HASH INDEX AT SHUTDOWN

i_s_innodb_buffer_page_get_info(): Add a comment that
the IS_HASHED column in the INFORMATION_SCHEMA views
INNODB_BUFFER_POOL_PAGE and INNODB_BUFFER_PAGE_LRU may
show false positives (there may be no pointers after all.)

ha_insert_for_fold_func(), ha_delete_hash_node(),
ha_search_and_update_if_found_func(): Use atomics for
updating buf_block_t::n_pointers. While buf_block_t::index is
always protected by btr_search_x_lock(index), in
ha_insert_for_fold_func() the n_pointers-- may belong to
another dict_index_t whose btr_search_latches[] we are not holding.

RB: 13879
Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
2017-04-26 23:03:27 +03:00
Shaohua Wang
9ce1ea6f51 BUG#24009272 SEGFAULT WITH CREATE+SELECT FROM IS+DROP FTS TABLE CONCURRENTLY
Analysis:
When we access fts_internal_tbl_name in i_s_fts_config_fill (),
it can be set to NULL by another session.

Solution:
Define fts_internal_tbl_name2 for global variable innodb_ft_aux_table,
if it's NULL, set fts_internal_tbl_name to "default".

Reviewed-by: Jimmy Yang <jimmy.yang@oracle.com>
RB: 13401
2017-04-24 14:53:22 +03:00
Sergei Golubchik
da4d71d10d Merge branch '10.1' into 10.2 2017-03-30 12:48:42 +02:00
Jan Lindström
50eb40a2a8 MDEV-11738: Mariadb uses 100% of several of my 8 cpus doing nothing
MDEV-11581: Mariadb starts InnoDB encryption threads
when key has not changed or data scrubbing turned off

Background: Key rotation is based on background threads
(innodb-encryption-threads) periodically going through
all tablespaces on fil_system. For each tablespace
current used key version is compared to max key age
(innodb-encryption-rotate-key-age). This process
naturally takes CPU. Similarly, in same time need for
scrubbing is investigated. Currently, key rotation
is fully supported on Amazon AWS key management plugin
only but InnoDB does not have knowledge what key
management plugin is used.

This patch re-purposes innodb-encryption-rotate-key-age=0
to disable key rotation and background data scrubbing.
All new tables are added to special list for key rotation
and key rotation is based on sending a event to
background encryption threads instead of using periodic
checking (i.e. timeout).

fil0fil.cc: Added functions fil_space_acquire_low()
to acquire a tablespace when it could be dropped concurrently.
This function is used from fil_space_acquire() or
fil_space_acquire_silent() that will not print
any messages if we try to acquire space that does not exist.
fil_space_release() to release a acquired tablespace.
fil_space_next() to iterate tablespaces in fil_system
using fil_space_acquire() and fil_space_release().
Similarly, fil_space_keyrotation_next() to iterate new
list fil_system->rotation_list where new tables.
are added if key rotation is disabled.
Removed unnecessary functions fil_get_first_space_safe()
fil_get_next_space_safe()

fil_node_open_file(): After page 0 is read read also
crypt_info if it is not yet read.

btr_scrub_lock_dict_func()
buf_page_check_corrupt()
buf_page_encrypt_before_write()
buf_merge_or_delete_for_page()
lock_print_info_all_transactions()
row_fts_psort_info_init()
row_truncate_table_for_mysql()
row_drop_table_for_mysql()
    Use fil_space_acquire()/release() to access fil_space_t.

buf_page_decrypt_after_read():
    Use fil_space_get_crypt_data() because at this point
    we might not yet have read page 0.

fil0crypt.cc/fil0fil.h: Lot of changes. Pass fil_space_t* directly
to functions needing it and store fil_space_t* to rotation state.
Use fil_space_acquire()/release() when iterating tablespaces
and removed unnecessary is_closing from fil_crypt_t. Use
fil_space_t::is_stopping() to detect when access to
tablespace should be stopped. Removed unnecessary
fil_space_get_crypt_data().

fil_space_create(): Inform key rotation that there could
be something to do if key rotation is disabled and new
table with encryption enabled is created.
Remove unnecessary functions fil_get_first_space_safe()
and fil_get_next_space_safe(). fil_space_acquire()
and fil_space_release() are used instead. Moved
fil_space_get_crypt_data() and fil_space_set_crypt_data()
to fil0crypt.cc.

fsp_header_init(): Acquire fil_space_t*, write crypt_data
and release space.

check_table_options()
	Renamed FIL_SPACE_ENCRYPTION_* TO FIL_ENCRYPTION_*

i_s.cc: Added ROTATING_OR_FLUSHING field to
information_schema.innodb_tablespace_encryption
to show current status of key rotation.
2017-03-14 16:23:10 +02:00
Marko Mäkelä
89d80c1b0b Fix many -Wconversion warnings.
Define my_thread_id as an unsigned type, to avoid mismatch with
ulonglong.  Change some parameters to this type.

Use size_t in a few more places.

Declare many flag constants as unsigned to avoid sign mismatch
when shifting bits or applying the unary ~ operator.

When applying the unary ~ operator to enum constants, explictly
cast the result to an unsigned type, because enum constants can
be treated as signed.

In InnoDB, change the source code line number parameters from
ulint to unsigned type. Also, make some InnoDB functions return
a narrower type (unsigned or uint32_t instead of ulint;
bool instead of ibool).
2017-03-07 19:07:27 +02:00
Marko Mäkelä
27b9989d31 MDEV-12121 Introduce build option WITH_INNODB_AHI to disable innodb_adaptive_hash_index
The InnoDB adaptive hash index is sometimes degrading the performance of
InnoDB, and it is sometimes disabled to get more consistent performance.
We should have a compile-time option to disable the adaptive hash index.

Let us introduce two options:

OPTION(WITH_INNODB_AHI "Include innodb_adaptive_hash_index" ON)
OPTION(WITH_INNODB_ROOT_GUESS "Cache index root block descriptors" ON)

where WITH_INNODB_AHI always implies WITH_INNODB_ROOT_GUESS.

As part of this change, the misleadingly named function
trx_search_latch_release_if_reserved(trx) will be replaced with the macro
trx_assert_no_search_latch(trx) that will be empty unless
BTR_CUR_HASH_ADAPT is defined (cmake -DWITH_INNODB_AHI=ON).

We will also remove the unused column
INFORMATION_SCHEMA.INNODB_TRX.TRX_ADAPTIVE_HASH_TIMEOUT.
In MariaDB Server 10.1, it used to reflect the value of
trx_t::search_latch_timeout which could be adjusted during
row_search_for_mysql(). In 10.2, there is no such field.

Other than the removal of the unused column TRX_ADAPTIVE_HASH_TIMEOUT,
this is an almost non-functional change to the server when using the
default build options.

Some tests are adjusted so that they will work with both
-DWITH_INNODB_AHI=ON and -DWITH_INNODB_AHI=OFF. The test
innodb.innodb_monitor has been renamed to innodb.monitor
in order to track MySQL 5.7, and the duplicate tests
sys_vars.innodb_monitor_* are removed.
2017-03-03 16:55:50 +02:00
Marko Mäkelä
494e4b99a4 Remove MYSQL_TABLESPACES.
MySQL 5.7 introduced partial support for user-created shared tablespaces
(for example, import and export are not supported).

MariaDB Server does not support tablespaces at this point of time.
Let us remove most InnoDB code and data structures that is related
to shared tablespaces.
2017-01-18 08:30:43 +02:00
Marko Mäkelä
c849b7df61 MDEV-11785 Remove INFORMATION_SCHEMA.INNODB_TEMP_TABLE_INFO
The INFORMATION_SCHEMA view INNODB_TEMP_TABLE_INFO was added to
MySQL 5.7 as part of the work to implement temporary tables
without any redo logging.

The only use case of this view was SELECT COUNT(*) in some tests,
to see how many temporary tables exist in InnoDB. The columns do
not report much useful information. For example, the table name
would not be the user-specified table name, but a generated #sql
name. Also, the session that created the table is not identified.
2017-01-17 12:09:23 +02:00
Sergei Golubchik
b66976abb4 cleanup: InnoDB, various minor issues
* fix "unused pending_checkpoint_mutex_key" compiler warning
* clarify/simplify get_field_offset()
* typos, comments
* unused (forgotten) declaration of create_options_are_invalid()
* fix my_error(ER_WRONG_KEY_COLUMN) calls
* crash in row_upd_sec_index_entry()
* double if (ret != 0)
* don't duplucate PSI_INSTRUMENT_ME lines
* useless break; after return();
* remove unused xtradb-only "cursor_read_view" stuff
* code formatting
* simplify dropped column detection
* redundant assignment
2016-12-12 20:27:41 +01:00
Marko Mäkelä
0b66d3f70d MDEV-11426 Remove InnoDB INFORMATION_SCHEMA.FILES implementation
MySQL 5.7 introduced WL#7943: InnoDB: Implement Information_Schema.Files
to provide a long-term alternative for accessing tablespace metadata.
The INFORMATION_SCHEMA.INNODB_* views are considered internal interfaces
that are subject to change or removal between releases. So, users should
refer to I_S.FILES instead of I_S.INNODB_SYS_TABLESPACES to fetch metadata
about CREATE TABLESPACE.

Because MariaDB 10.2 does not support CREATE TABLESPACE or
CREATE TABLE…TABLESPACE for InnoDB, it does not make sense to support
I_S.FILES either. So, let MariaDB 10.2 omit the code that was added in
MySQL 5.7. After this change, I_S.FILES will report the empty result,
unless some other storage engine in MariaDB 10.2 implements the interface.
(The I_S.FILES interface was originally created for the NDB Cluster.)
2016-12-01 13:16:25 +02:00
Monty
14b1c8c80d After merge and bug fixes
- Fixed compiler warnings
- Removed have_debug.inc from innochecksum_3
- Fixed race condition in innodb_buffer_pool_load
- Fixed merge issue in innodb-bad-key-change.test
- Fixed missing array allocation that could cause
  function_defaults_notembedded to fail
- Fixed thread_cache_size_func
2016-10-05 01:11:08 +03:00
Monty
c1125c3218 Fixed compiler warnings and failing tests 2016-10-05 01:11:08 +03:00
Sergei Golubchik
66d9696596 Merge branch '10.0' into 10.1 2016-09-28 17:55:28 +02:00
Sergei Golubchik
3629f62d29 Merge branch 'merge/merge-innodb-5.6' into 10.0 2016-09-27 18:05:06 +02:00
Sergei Golubchik
094f140c9a 5.6.33 2016-09-27 17:56:00 +02:00
Sergei Golubchik
f9bdc7c01a Merge branch '10.2' into bb-10.2-jan 2016-09-19 09:47:08 +02:00
Sergei Golubchik
dc900cc846 Remove a bunch of TODO's, fix perfschema.threads_innodb test 2016-09-11 10:57:05 +02:00
Jan Lindström
fec844aca8 Merge InnoDB 5.7 from mysql-5.7.14.
Contains also:
       MDEV-10549 mysqld: sql/handler.cc:2692: int handler::ha_index_first(uchar*): Assertion `table_share->tmp_table != NO_TMP_TABLE || m_lock_type != 2' failed. (branch bb-10.2-jan)
       Unlike MySQL, InnoDB still uses THR_LOCK in MariaDB

       MDEV-10548 Some of the debug sync waits do not work with InnoDB 5.7 (branch bb-10.2-jan)
       enable tests that were fixed in MDEV-10549

       MDEV-10548 Some of the debug sync waits do not work with InnoDB 5.7 (branch bb-10.2-jan)
       fix main.innodb_mysql_sync - re-enable online alter for partitioned innodb tables
2016-09-08 15:49:03 +03:00
Sergei Golubchik
61fd38a1de update plugin maturities 2016-09-05 17:11:14 +02:00
Jan Lindström
2e814d4702 Merge InnoDB 5.7 from mysql-5.7.9.
Contains also

MDEV-10547: Test multi_update_innodb fails with InnoDB 5.7

	The failure happened because 5.7 has changed the signature of
	the bool handler::primary_key_is_clustered() const
	virtual function ("const" was added). InnoDB was using the old
	signature which caused the function not to be used.

MDEV-10550: Parallel replication lock waits/deadlock handling does not work with InnoDB 5.7

	Fixed mutexing problem on lock_trx_handle_wait. Note that
	rpl_parallel and rpl_optimistic_parallel tests still
	fail.

MDEV-10156 : Group commit tests fail on 10.2 InnoDB (branch bb-10.2-jan)
  Reason: incorrect merge

MDEV-10550: Parallel replication can't sync with master in InnoDB 5.7 (branch bb-10.2-jan)
  Reason: incorrect merge
2016-09-02 13:22:28 +03:00
Sergei Golubchik
6b1863b830 Merge branch '10.0' into 10.1 2016-08-25 12:40:09 +02:00
Sergei Golubchik
57fbc603bf Merge branch 'merge/merge-innodb-5.6' into 10.0
5.6.32
2016-08-10 19:43:37 +02:00
Sergei Golubchik
b4f97a1499 5.6.32 2016-08-10 19:23:00 +02:00
Sergei Golubchik
3fdc6140a3 update plugins' maturity levels 2016-03-18 22:05:23 +01:00
Jan Lindström
ee768d8e0e MDEV-9640: Add used key_id to INFORMATION_SCHEMA.INNODB_TABLESPACES_ENCRYPTION 2016-03-18 11:48:49 +02:00
Sergei Golubchik
5091a4ba75 Merge tag 'mariadb-10.0.19' into 10.1 2015-06-01 15:51:25 +02:00
Sergei Golubchik
ac286a9bc7 Merge branch '5.5' into 10.0 2015-05-08 11:20:43 +02:00