Commit graph

114 commits

Author SHA1 Message Date
Marko Mäkelä
2ac8b1a907 MDEV-12266: Add fil_system.sys_space, temp_space
Add fil_system_t::sys_space, fil_system_t::temp_space.
These will replace lookups for TRX_SYS_SPACE or SRV_TMP_SPACE_ID.

mtr_t::m_undo_space, mtr_t::m_sys_space: Remove.

mtr_t::set_sys_modified(): Remove.

fil_space_get_type(), fil_space_get_n_reserved_extents(): Remove.

fsp_header_get_tablespace_size(), fsp_header_inc_size():
Merge to the only caller, innobase_start_or_create_for_mysql().
2018-03-29 19:18:11 +03:00
Marko Mäkelä
33714d2065 Merge bb-10.2-ext into 10.3 2018-01-30 21:04:48 +02:00
Marko Mäkelä
d9c77f0341 Revert "MDEV-6928: Add trx pointer to struct mtr_t"
This reverts commit 3486135bb5.

The commit comment ended in the words: "This is needed later."
Apparently the "later" never arrived.
2018-01-29 15:45:16 +02:00
Marko Mäkelä
a4948dafcd MDEV-11369 Instant ADD COLUMN for InnoDB
For InnoDB tables, adding, dropping and reordering columns has
required a rebuild of the table and all its indexes. Since MySQL 5.6
(and MariaDB 10.0) this has been supported online (LOCK=NONE), allowing
concurrent modification of the tables.

This work revises the InnoDB ROW_FORMAT=REDUNDANT, ROW_FORMAT=COMPACT
and ROW_FORMAT=DYNAMIC so that columns can be appended instantaneously,
with only minor changes performed to the table structure. The counter
innodb_instant_alter_column in INFORMATION_SCHEMA.GLOBAL_STATUS
is incremented whenever a table rebuild operation is converted into
an instant ADD COLUMN operation.

ROW_FORMAT=COMPRESSED tables will not support instant ADD COLUMN.

Some usability limitations will be addressed in subsequent work:

MDEV-13134 Introduce ALTER TABLE attributes ALGORITHM=NOCOPY
and ALGORITHM=INSTANT
MDEV-14016 Allow instant ADD COLUMN, ADD INDEX, LOCK=NONE

The format of the clustered index (PRIMARY KEY) is changed as follows:

(1) The FIL_PAGE_TYPE of the root page will be FIL_PAGE_TYPE_INSTANT,
and a new field PAGE_INSTANT will contain the original number of fields
in the clustered index ('core' fields).
If instant ADD COLUMN has not been used or the table becomes empty,
or the very first instant ADD COLUMN operation is rolled back,
the fields PAGE_INSTANT and FIL_PAGE_TYPE will be reset
to 0 and FIL_PAGE_INDEX.

(2) A special 'default row' record is inserted into the leftmost leaf,
between the page infimum and the first user record. This record is
distinguished by the REC_INFO_MIN_REC_FLAG, and it is otherwise in the
same format as records that contain values for the instantly added
columns. This 'default row' always has the same number of fields as
the clustered index according to the table definition. The values of
'core' fields are to be ignored. For other fields, the 'default row'
will contain the default values as they were during the ALTER TABLE
statement. (If the column default values are changed later, those
values will only be stored in the .frm file. The 'default row' will
contain the original evaluated values, which must be the same for
every row.) The 'default row' must be completely hidden from
higher-level access routines. Assertions have been added to ensure
that no 'default row' is ever present in the adaptive hash index
or in locked records. The 'default row' is never delete-marked.

(3) In clustered index leaf page records, the number of fields must
reside between the number of 'core' fields (dict_index_t::n_core_fields
introduced in this work) and dict_index_t::n_fields. If the number
of fields is less than dict_index_t::n_fields, the missing fields
are replaced with the column value of the 'default row'.
Note: The number of fields in the record may shrink if some of the
last instantly added columns are updated to the value that is
in the 'default row'. The function btr_cur_trim() implements this
'compression' on update and rollback; dtuple::trim() implements it
on insert.

(4) In ROW_FORMAT=COMPACT and ROW_FORMAT=DYNAMIC records, the new
status value REC_STATUS_COLUMNS_ADDED will indicate the presence of
a new record header that will encode n_fields-n_core_fields-1 in
1 or 2 bytes. (In ROW_FORMAT=REDUNDANT records, the record header
always explicitly encodes the number of fields.)

We introduce the undo log record type TRX_UNDO_INSERT_DEFAULT for
covering the insert of the 'default row' record when instant ADD COLUMN
is used for the first time. Subsequent instant ADD COLUMN can use
TRX_UNDO_UPD_EXIST_REC.

This is joint work with Vin Chen (陈福荣) from Tencent. The design
that was discussed in April 2017 would not have allowed import or
export of data files, because instead of the 'default row' it would
have introduced a data dictionary table. The test
rpl.rpl_alter_instant is exactly as contributed in pull request #408.
The test innodb.instant_alter is based on a contributed test.

The redo log record format changes for ROW_FORMAT=DYNAMIC and
ROW_FORMAT=COMPACT are as contributed. (With this change present,
crash recovery from MariaDB 10.3.1 will fail in spectacular ways!)
Also the semantics of higher-level redo log records that modify the
PAGE_INSTANT field is changed. The redo log format version identifier
was already changed to LOG_HEADER_FORMAT_CURRENT=103 in MariaDB 10.3.1.

Everything else has been rewritten by me. Thanks to Elena Stepanova,
the code has been tested extensively.

When rolling back an instant ADD COLUMN operation, we must empty the
PAGE_FREE list after deleting or shortening the 'default row' record,
by calling either btr_page_empty() or btr_page_reorganize(). We must
know the size of each entry in the PAGE_FREE list. If rollback left a
freed copy of the 'default row' in the PAGE_FREE list, we would be
unable to determine its size (if it is in ROW_FORMAT=COMPACT or
ROW_FORMAT=DYNAMIC) because it would contain more fields than the
rolled-back definition of the clustered index.

UNIV_SQL_DEFAULT: A new special constant that designates an instantly
added column that is not present in the clustered index record.

len_is_stored(): Check if a length is an actual length. There are
two magic length values: UNIV_SQL_DEFAULT, UNIV_SQL_NULL.

dict_col_t::def_val: The 'default row' value of the column.  If the
column is not added instantly, def_val.len will be UNIV_SQL_DEFAULT.

dict_col_t: Add the accessors is_virtual(), is_nullable(), is_instant(),
instant_value().

dict_col_t::remove_instant(): Remove the 'instant ADD' status of
a column.

dict_col_t::name(const dict_table_t& table): Replaces
dict_table_get_col_name().

dict_index_t::n_core_fields: The original number of fields.
For secondary indexes and if instant ADD COLUMN has not been used,
this will be equal to dict_index_t::n_fields.

dict_index_t::n_core_null_bytes: Number of bytes needed to
represent the null flags; usually equal to UT_BITS_IN_BYTES(n_nullable).

dict_index_t::NO_CORE_NULL_BYTES: Magic value signalling that
n_core_null_bytes was not initialized yet from the clustered index
root page.

dict_index_t: Add the accessors is_instant(), is_clust(),
get_n_nullable(), instant_field_value().

dict_index_t::instant_add_field(): Adjust clustered index metadata
for instant ADD COLUMN.

dict_index_t::remove_instant(): Remove the 'instant ADD' status
of a clustered index when the table becomes empty, or the very first
instant ADD COLUMN operation is rolled back.

dict_table_t: Add the accessors is_instant(), is_temporary(),
supports_instant().

dict_table_t::instant_add_column(): Adjust metadata for
instant ADD COLUMN.

dict_table_t::rollback_instant(): Adjust metadata on the rollback
of instant ADD COLUMN.

prepare_inplace_alter_table_dict(): First create the ctx->new_table,
and only then decide if the table really needs to be rebuilt.
We must split the creation of table or index metadata from the
creation of the dictionary table records and the creation of
the data. In this way, we can transform a table-rebuilding operation
into an instant ADD COLUMN operation. Dictionary objects will only
be added to cache when table rebuilding or index creation is needed.
The ctx->instant_table will never be added to cache.

dict_table_t::add_to_cache(): Modified and renamed from
dict_table_add_to_cache(). Do not modify the table metadata.
Let the callers invoke dict_table_add_system_columns() and if needed,
set can_be_evicted.

dict_create_sys_tables_tuple(), dict_create_table_step(): Omit the
system columns (which will now exist in the dict_table_t object
already at this point).

dict_create_table_step(): Expect the callers to invoke
dict_table_add_system_columns().

pars_create_table(): Before creating the table creation execution
graph, invoke dict_table_add_system_columns().

row_create_table_for_mysql(): Expect all callers to invoke
dict_table_add_system_columns().

create_index_dict(): Replaces row_merge_create_index_graph().

innodb_update_n_cols(): Renamed from innobase_update_n_virtual().
Call my_error() if an error occurs.

btr_cur_instant_init(), btr_cur_instant_init_low(),
btr_cur_instant_root_init():
Load additional metadata from the clustered index and set
dict_index_t::n_core_null_bytes. This is invoked
when table metadata is first loaded into the data dictionary.

dict_boot(): Initialize n_core_null_bytes for the four hard-coded
dictionary tables.

dict_create_index_step(): Initialize n_core_null_bytes. This is
executed as part of CREATE TABLE.

dict_index_build_internal_clust(): Initialize n_core_null_bytes to
NO_CORE_NULL_BYTES if table->supports_instant().

row_create_index_for_mysql(): Initialize n_core_null_bytes for
CREATE TEMPORARY TABLE.

commit_cache_norebuild(): Call the code to rename or enlarge columns
in the cache only if instant ADD COLUMN is not being used.
(Instant ADD COLUMN would copy all column metadata from
instant_table to old_table, including the names and lengths.)

PAGE_INSTANT: A new 13-bit field for storing dict_index_t::n_core_fields.
This is repurposing the 16-bit field PAGE_DIRECTION, of which only the
least significant 3 bits were used. The original byte containing
PAGE_DIRECTION will be accessible via the new constant PAGE_DIRECTION_B.

page_get_instant(), page_set_instant(): Accessors for the PAGE_INSTANT.

page_ptr_get_direction(), page_get_direction(),
page_ptr_set_direction(): Accessors for PAGE_DIRECTION.

page_direction_reset(): Reset PAGE_DIRECTION, PAGE_N_DIRECTION.

page_direction_increment(): Increment PAGE_N_DIRECTION
and set PAGE_DIRECTION.

rec_get_offsets(): Use the 'leaf' parameter for non-debug purposes,
and assume that heap_no is always set.
Initialize all dict_index_t::n_fields for ROW_FORMAT=REDUNDANT records,
even if the record contains fewer fields.

rec_offs_make_valid(): Add the parameter 'leaf'.

rec_copy_prefix_to_dtuple(): Assert that the tuple is only built
on the core fields. Instant ADD COLUMN only applies to the
clustered index, and we should never build a search key that has
more than the PRIMARY KEY and possibly DB_TRX_ID,DB_ROLL_PTR.
All these columns are always present.

dict_index_build_data_tuple(): Remove assertions that would be
duplicated in rec_copy_prefix_to_dtuple().

rec_init_offsets(): Support ROW_FORMAT=REDUNDANT records whose
number of fields is between n_core_fields and n_fields.

cmp_rec_rec_with_match(): Implement the comparison between two
MIN_REC_FLAG records.

trx_t::in_rollback: Make the field available in non-debug builds.

trx_start_for_ddl_low(): Remove dangerous error-tolerance.
A dictionary transaction must be flagged as such before it has generated
any undo log records. This is because trx_undo_assign_undo() will mark
the transaction as a dictionary transaction in the undo log header
right before the very first undo log record is being written.

btr_index_rec_validate(): Account for instant ADD COLUMN

row_undo_ins_remove_clust_rec(): On the rollback of an insert into
SYS_COLUMNS, revert instant ADD COLUMN in the cache by removing the
last column from the table and the clustered index.

row_search_on_row_ref(), row_undo_mod_parse_undo_rec(), row_undo_mod(),
trx_undo_update_rec_get_update(): Handle the 'default row'
as a special case.

dtuple_t::trim(index): Omit a redundant suffix of an index tuple right
before insert or update. After instant ADD COLUMN, if the last fields
of a clustered index tuple match the 'default row', there is no
need to store them. While trimming the entry, we must hold a page latch,
so that the table cannot be emptied and the 'default row' be deleted.

btr_cur_optimistic_update(), btr_cur_pessimistic_update(),
row_upd_clust_rec_by_insert(), row_ins_clust_index_entry_low():
Invoke dtuple_t::trim() if needed.

row_ins_clust_index_entry(): Restore dtuple_t::n_fields after calling
row_ins_clust_index_entry_low().

rec_get_converted_size(), rec_get_converted_size_comp(): Allow the number
of fields to be between n_core_fields and n_fields. Do not support
infimum,supremum. They are never supposed to be stored in dtuple_t,
because page creation nowadays uses a lower-level method for initializing
them.

rec_convert_dtuple_to_rec_comp(): Assign the status bits based on the
number of fields.

btr_cur_trim(): In an update, trim the index entry as needed. For the
'default row', handle rollback specially. For user records, omit
fields that match the 'default row'.

btr_cur_optimistic_delete_func(), btr_cur_pessimistic_delete():
Skip locking and adaptive hash index for the 'default row'.

row_log_table_apply_convert_mrec(): Replace 'default row' values if needed.
In the temporary file that is applied by row_log_table_apply(),
we must identify whether the records contain the extra header for
instantly added columns. For now, we will allocate an additional byte
for this for ROW_T_INSERT and ROW_T_UPDATE records when the source table
has been subject to instant ADD COLUMN. The ROW_T_DELETE records are
fine, as they will be converted and will only contain 'core' columns
(PRIMARY KEY and some system columns) that are converted from dtuple_t.

rec_get_converted_size_temp(), rec_init_offsets_temp(),
rec_convert_dtuple_to_temp(): Add the parameter 'status'.

REC_INFO_DEFAULT_ROW = REC_INFO_MIN_REC_FLAG | REC_STATUS_COLUMNS_ADDED:
An info_bits constant for distinguishing the 'default row' record.

rec_comp_status_t: An enum of the status bit values.

rec_leaf_format: An enum that replaces the bool parameter of
rec_init_offsets_comp_ordinary().
2017-10-06 09:50:10 +03:00
Marko Mäkelä
48192f963a Add the parameter bool leaf to rec_get_offsets()
This should affect debug builds only. Debug builds will check that
the status bits of ROW_FORMAT!=REDUNDANT records match the is_leaf
parameter.

The only observable change to non-debug should be the addition of
the is_leaf parameter to the function rec_copy_prefix_to_dtuple(),
and the removal of some calls to update the adaptive hash index
(it is only built for the leaf pages).

This change should have been made in MySQL 5.0.3, instead of
introducing the status flags in the ROW_FORMAT=COMPACT record header.
2017-09-20 16:53:34 +03:00
Marko Mäkelä
6b687a0fde Introduce page_rec_is_leaf() and clean up page0page.h
Define some page accessor functions inline in page0page.h,
reducing code duplication in page0page.ic.

Use page_rec_is_leaf() instead of page_is_leaf() where possible.
2017-09-20 08:42:44 +03:00
Marko Mäkelä
dcdc1c6d09 MDEV-13452 Assertion `!recv_no_log_write' failed in log_reserve_and_open()
The debug flag recv_no_log_write prohibits writes of redo log records for
modifying page data. The debug assertion was failing when fil_names_clear()
was writing the informative MLOG_FILE_NAME and MLOG_CHECKPOINT records
which do not modify any data.

log_reserve_and_open(), log_write_low(): Remove the debug assertion.

log_pad_current_log_block(), mtr_write_log(),
mtr_t::Command::prepare_write(): Add the debug assertion.
2017-08-07 13:54:37 +03:00
Marko Mäkelä
42f657cd2f MDEV-13267 At startup with crash recovery: mtr_t::commit_checkpoint(lsn_t, bool): Assertion `!recv_no_log_write' failed
This is a bogus debug assertion failure that should be possible
starting with MariaDB 10.2.2 (which merged WL#7142 via MySQL 5.7.9).

While generating page-change redo log records is strictly out of the
question during tat certain parts of crash recovery, the
fil_names_clear() is only emitting informational MLOG_FILE_NAME
and MLOG_CHECKPOINT records to guarantee that if the server is killed
during or soon after the crash recovery, subsequent crash recovery
will be possible.

The metadata buffer that fil_names_clear() is flushing to the redo log
is being filled by recv_init_crash_recovery_spaces(), right before
starting to apply redo log, by invoking fil_names_dirty() on every
discovered tablespace for which there are changes to apply.

When it comes to Mariabackup (xtrabackup --prepare), it is strictly out
of the question to generate any redo log whatsoever, because that could
break the restore of incremental backups by causing LSN deviation.
So, the fil_names_dirty() call must be skipped when restoring backups.

recv_recovery_from_checkpoint_start(): Do not invoke fil_names_clear()
when restoring a backup.

mtr_t::commit_checkpoint(): Remove the failing assertion. The only
caller is fil_names_clear(), and it must be called by
recv_recovery_from_checkpoint_start() for normal server startup to be
crash-safe. The debug assertion in mtr_t::commit() will still
catch rogue redo log writes.
2017-07-07 18:40:57 +03:00
Marko Mäkelä
8c71c6aa8b MDEV-12548 Initial implementation of Mariabackup for MariaDB 10.2
InnoDB I/O and buffer pool interfaces and the redo log format
have been changed between MariaDB 10.1 and 10.2, and the backup
code has to be adjusted accordingly.

The code has been simplified, and many memory leaks have been fixed.
Instead of the file name xtrabackup_logfile, the file name ib_logfile0
is being used for the copy of the redo log. Unnecessary InnoDB startup and
shutdown and some unnecessary threads have been removed.

Some help was provided by Vladislav Vaintroub.

Parameters have been cleaned up and aligned with those of MariaDB 10.2.

The --dbug option has been added, so that in debug builds,
--dbug=d,ib_log can be specified to enable diagnostic messages
for processing redo log entries.

By default, innodb_doublewrite=OFF, so that --prepare works faster.
If more crash-safety for --prepare is needed, double buffering
can be enabled.

The parameter innodb_log_checksums=OFF can be used to ignore redo log
checksums in --backup.

Some messages have been cleaned up.
Unless --export is specified, Mariabackup will not deal with undo log.
The InnoDB mini-transaction redo log is not only about user-level
transactions; it is actually about mini-transactions. To avoid confusion,
call it the redo log, not transaction log.

We disable any undo log processing in --prepare.

Because MariaDB 10.2 supports indexed virtual columns, the
undo log processing would need to be able to evaluate virtual column
expressions. To reduce the amount of code dependencies, we will not
process any undo log in prepare.

This means that the --export option must be disabled for now.

This also means that the following options are redundant
and have been removed:
	xtrabackup --apply-log-only
	innobackupex --redo-only

In addition to disabling any undo log processing, we will disable any
further changes to data pages during --prepare, including the change
buffer merge. This means that restoring incremental backups should
reliably work even when change buffering is being used on the server.
Because of this, preparing a backup will not generate any further
redo log, and the redo log file can be safely deleted. (If the
--export option is enabled in the future, it must generate redo log
when processing undo logs and buffered changes.)

In --prepare, we cannot easily know if a partial backup was used,
especially when restoring a series of incremental backups. So, we
simply warn about any missing files, and ignore the redo log for them.

FIXME: Enable the --export option.

FIXME: Improve the handling of the MLOG_INDEX_LOAD record, and write
a test that initiates a backup while an ALGORITHM=INPLACE operation
is creating indexes or rebuilding a table. An error should be detected
when preparing the backup.

FIXME: In --incremental --prepare, xtrabackup_apply_delta() should
ensure that if FSP_SIZE is modified, the file size will be adjusted
accordingly.
2017-07-05 11:43:28 +03:00
Marko Mäkelä
3e1d0ff574 Fix a merge error in commit 8f643e2063
A merge error caused InnoDB bootstrap to fail when
innodb_undo_tablespaces was set to more than 2.
This was because of a bug that was introduced to
srv_undo_tablespaces_init() by the merge.

Furthermore, some adjustments for Oracle Bug#25551311 aka
Bug#23517560 changes were forgotten. We must minimize direct
references to srv_undo_tablespaces_open and use predicates
instead.

srv_undo_tablespaces_init(): Increment srv_undo_tablespaces_open
once, not twice, per loop iteration.

is_system_or_undo_tablespace(): Remove (unused function).

is_predefined_tablespace(): Invoke srv_is_undo_tablespace().
2017-06-27 21:23:12 +03:00
Thirunarayanan Balathandayuthapani
2ef1baa75f Bug #24793413 LOG PARSING BUFFER OVERFLOW
Problem:
========
During checkpoint, we are writing all MLOG_FILE_NAME records in one mtr
and parse buffer can't be processed till MLOG_MULTI_REC_END. Eventually parse
buffer exceeds the RECV_PARSING_BUF_SIZE and eventually it overflows.

Fix:
===
1) Break the large mtr if it exceeds LOG_CHECKPOINT_FREE_PER_THREAD into multiple mtr during checkpoint.
2) Move the parsing buffer if we are encountering only MLOG_FILE_NAME
records. So that it will never exceed the RECV_PARSING_BUF_SIZE.

Reviewed-by: Debarun Bannerjee <debarun.bannerjee@oracle.com>
Reviewed-by: Rahul M Malik <rahul.m.malik@oracle.com>
RB: 14743
2017-04-26 23:03:32 +03:00
Marko Mäkelä
4e1116b2c6 MDEV-12271 Port MySQL 8.0 Bug#23150562 REMOVE UNIV_MUST_NOT_INLINE AND UNIV_NONINL
Also, remove empty .ic files that were not removed by my MySQL commit.

Problem:
InnoDB used to support a compilation mode that allowed to choose
whether the function definitions in .ic files are to be inlined or not.
This stopped making sense when InnoDB moved to C++ in MySQL 5.6
(and ha_innodb.cc started to #include .ic files), and more so in
MySQL 5.7 when inline methods and functions were introduced
in .h files.

Solution:
Remove all references to UNIV_NONINL and UNIV_MUST_NOT_INLINE from
all files, assuming that the symbols are never defined.
Remove the files fut0fut.cc and ut0byte.cc which only mattered when
UNIV_NONINL was defined.
2017-03-17 12:42:07 +02:00
Marko Mäkelä
ff8bf6e933 Define a mtr_t::start() wrapper inline. 2017-03-13 18:11:01 +02:00
Marko Mäkelä
ad0c218a44 Merge 10.0 into 10.1
Also, implement MDEV-11027 a little differently from 5.5 and 10.0:

recv_apply_hashed_log_recs(): Change the return type back to void
(DB_SUCCESS was always returned).

Report progress also via systemd using sd_notifyf().
2017-03-09 08:53:08 +02:00
Marko Mäkelä
89d80c1b0b Fix many -Wconversion warnings.
Define my_thread_id as an unsigned type, to avoid mismatch with
ulonglong.  Change some parameters to this type.

Use size_t in a few more places.

Declare many flag constants as unsigned to avoid sign mismatch
when shifting bits or applying the unary ~ operator.

When applying the unary ~ operator to enum constants, explictly
cast the result to an unsigned type, because enum constants can
be treated as signed.

In InnoDB, change the source code line number parameters from
ulint to unsigned type. Also, make some InnoDB functions return
a narrower type (unsigned or uint32_t instead of ulint;
bool instead of ibool).
2017-03-07 19:07:27 +02:00
Vicențiu Ciorbaru
1acfa942ed Merge branch '5.5' into 10.0 2017-03-03 01:37:54 +02:00
Jan Lindström
108b211ee2 Fix gcc 6.3.x compiler warnings.
These are caused by fact that functions are declared with
__attribute__((nonnull)) or left shit like ~0 << macro
when ~0U << macro should be used.
2017-02-16 12:02:31 +02:00
Marko Mäkelä
8780b89529 MDEV-11831 Make InnoDB mini-transaction memo checks stricter
InnoDB keeps track of buffer-fixed buf_block_t or acquired rw_lock_t
within a mini-transaction. There are some memo_contains assertions
in the code that document when certain blocks or rw_locks must be held.
But, these assertions only check the mini-transaction memo, not the fact
whether the rw_lock_t are actually being held by the caller.

btr_pcur_store_position(): Remove #ifdef, and assert that the block
is always buffer-fixed.

rtr_pcur_getnext_from_path(), rtr_pcur_open_low(),
ibuf_rec_get_page_no_func(), ibuf_rec_get_space_func(),
ibuf_rec_get_info_func(), ibuf_rec_get_op_type_func(),
ibuf_build_entry_from_ibuf_rec_func(), ibuf_rec_get_volume_func(),
ibuf_get_merge_page_nos_func(), ibuf_get_volume_buffered_count_func()
ibuf_get_entry_counter_low_func(), page_set_ssn_id(),
row_vers_old_has_index_entry(), row_vers_build_for_consistent_read(),
row_vers_build_for_semi_consistent_read(),
trx_undo_prev_version_build():
Make use of mtr_memo_contains_page_flagged().

mtr_t::memo_contains(): Take a const memo. Assert rw_lock_own().

FindPage, FlaggedCheck: Assert rw_lock_own_flagged().
2017-01-18 14:57:10 +02:00
Marko Mäkelä
63574f1275 MDEV-11690 Remove UNIV_HOTBACKUP
The InnoDB source code contains quite a few references to a closed-source
hot backup tool which was originally called InnoDB Hot Backup (ibbackup)
and later incorporated in MySQL Enterprise Backup.

The open source backup tool XtraBackup uses the full database for recovery.
So, the references to UNIV_HOTBACKUP are only cluttering the source code.
2016-12-30 16:05:42 +02:00
Marko Mäkelä
c868acdf65 MDEV-11487 Revert InnoDB internal temporary tables from WL#7682
WL#7682 in MySQL 5.7 introduced the possibility to create light-weight
temporary tables in InnoDB. These are called 'intrinsic temporary tables'
in InnoDB, and in MySQL 5.7, they can be created by the optimizer for
sorting or buffering data in query processing.

In MariaDB 10.2, the optimizer temporary tables cannot be created in
InnoDB, so we should remove the dead code and related data structures.
2016-12-09 12:05:07 +02:00
Jan Lindström
fec844aca8 Merge InnoDB 5.7 from mysql-5.7.14.
Contains also:
       MDEV-10549 mysqld: sql/handler.cc:2692: int handler::ha_index_first(uchar*): Assertion `table_share->tmp_table != NO_TMP_TABLE || m_lock_type != 2' failed. (branch bb-10.2-jan)
       Unlike MySQL, InnoDB still uses THR_LOCK in MariaDB

       MDEV-10548 Some of the debug sync waits do not work with InnoDB 5.7 (branch bb-10.2-jan)
       enable tests that were fixed in MDEV-10549

       MDEV-10548 Some of the debug sync waits do not work with InnoDB 5.7 (branch bb-10.2-jan)
       fix main.innodb_mysql_sync - re-enable online alter for partitioned innodb tables
2016-09-08 15:49:03 +03:00
Jan Lindström
2e814d4702 Merge InnoDB 5.7 from mysql-5.7.9.
Contains also

MDEV-10547: Test multi_update_innodb fails with InnoDB 5.7

	The failure happened because 5.7 has changed the signature of
	the bool handler::primary_key_is_clustered() const
	virtual function ("const" was added). InnoDB was using the old
	signature which caused the function not to be used.

MDEV-10550: Parallel replication lock waits/deadlock handling does not work with InnoDB 5.7

	Fixed mutexing problem on lock_trx_handle_wait. Note that
	rpl_parallel and rpl_optimistic_parallel tests still
	fail.

MDEV-10156 : Group commit tests fail on 10.2 InnoDB (branch bb-10.2-jan)
  Reason: incorrect merge

MDEV-10550: Parallel replication can't sync with master in InnoDB 5.7 (branch bb-10.2-jan)
  Reason: incorrect merge
2016-09-02 13:22:28 +03:00
Sergei Golubchik
3361aee591 Merge branch '10.0' into 10.1 2016-06-28 22:01:55 +02:00
Sergei Golubchik
720e04ff67 5.6.31 2016-06-21 14:21:03 +02:00
Sergei Golubchik
6d06fbbd1d move to storage/innobase 2015-05-04 19:17:21 +02:00
Monty
d7d589dc01 Push for testing of encryption 2015-02-10 10:21:17 +01:00
Sergei Golubchik
15ee97214d InnoDB 5.6.15 merge.
update test results
2014-02-26 19:36:33 +01:00
Sergei Golubchik
27d45e4696 MDEV-5574 Set AUTO_INCREMENT below max value of column.
Update InnoDB to 5.6.14
Apply MySQL-5.6 hack for MySQL Bug#16434374
Move Aria-only HA_RTREE_INDEX from my_base.h to maria_def.h (breaks an assert in InnoDB)
Fix InnoDB memory leak
2014-02-01 09:33:26 +01:00
Murthy Narkedimilli
b292b5d2e3 Fixing the bug 16919882 - WRONG FSF ADDRESS IN LICENSES HEADERS 2013-06-10 22:29:41 +02:00
Michael Widenius
068c61978e Temporary commit of 10.0-merge 2013-03-26 00:03:13 +02:00
Marko Mäkelä
49adfa3d19 Bug#16138582 MTR_MEMO_RELEASE AND DYN_ARRAY TOGETHER ARE VERY INEFFICIENT
Get rid of O(n^2) scan in dyn array (mtr->memo) operations, accessing
the dyn array blocks directly.

dyn_array_get_last_block(), dyn_array_get_next_block(),
dyn_array_get_prev_block(): Define as a constness-preserving macro.

Add const qualifiers to many dyn_array functions.

mtr_memo_slot_release_func(): Renamed from mtr_memo_slot_release():
Make mtr_t* a debug-only parameter. Assume that slot->object != NULL.

mtr_memo_pop_all(): Access the dyn_array blocks directly, replacing
O(n^2) operation with O(n).

mtr_memo_release(): Access the dyn_array blocks directly, replacing
O(n^2) operation with O(n). This caused the performance problem.

rb#1540 approved by Jimmy Yang
2013-01-17 17:30:13 +02:00
Michael Widenius
1d0f70c2f8 Temporary commit of merge of MariaDB 10.0-base and MySQL 5.6 2012-08-01 17:27:34 +03:00
Marko Mäkelä
4ea57c80b2 Merge mysql-5.1 to mysql-5.5. 2012-02-17 11:52:51 +02:00
Marko Mäkelä
6ff320e0f6 Merge mysql-5.1 to mysql-5.5. 2012-02-16 12:28:49 +02:00
Marko Mäkelä
91b5e9352a Revert most of revno 3560.9.1 (Bug#12704861)
This was an attempt to address problems with the Bug#12612184 fix.
Even with this follow-up fix, crash recovery can be broken.
Let us fix the bug later.
2011-10-26 11:44:28 +03:00
Inaam Rana
c95fdd936e Revert original fix for Bug 12612184 and the follow up fix for
Bug 12704861.

Bug 12704861 fix was revno: 3504.1.1 (rb://693)
Bug 12612184 fix was revno: 3445.1.10 (rb://678)
2011-09-30 07:02:19 -04:00
Daniel Fischer
7450044eb7 merge from 5.5.16 2011-09-21 12:40:41 +02:00
unknown
40761a9a73 Merge from mysql-5.1.59-release 2011-09-15 18:48:54 +02:00
Marko Mäkelä
7c20cd1b5d Merge mysql-5.1 to mysql-5.5. 2011-08-29 11:22:43 +03:00
Marko Mäkelä
41f229cd9e Bug#12704861 Corruption after a crash during BLOB update
The fix of Bug#12612184 broke crash recovery. When a record that
contains off-page columns (BLOBs) is updated, we must first write redo
log about the BLOB page writes, and only after that write the redo log
about the B-tree changes. The buggy fix would log the B-tree changes
first, meaning that after recovery, we could end up having a record
that contains a null BLOB pointer.

Because we will be redo logging the writes off the off-page columns
before the B-tree changes, we must make sure that the pages chosen for
the off-page columns are free both before and after the B-tree
changes. In this way, the worst thing that can happen in crash
recovery is that the BLOBs are written to free pages, but the B-tree
changes are not applied. The BLOB pages would correctly remain free in
this case. To achieve this, we must allocate the BLOB pages in the
mini-transaction of the B-tree operation. A further quirk is that BLOB
pages are allocated from the same file segment as leaf pages. Because
of this, we must temporarily "hide" any leaf pages that were freed
during the B-tree operation by "fake allocating" them prior to writing
the BLOBs, and freeing them again before the mtr_commit() of the
B-tree operation, in btr_mark_freed_leaves().

btr_cur_mtr_commit_and_start(): Remove this faulty function that was
introduced in the Bug#12612184 fix. The problem that this function was
trying to address was that when we did mtr_commit() the BLOB writes
before the mtr_commit() of the update, the new BLOB pages could have
overwritten clustered index B-tree leaf pages that were freed during
the update. If recovery applied the redo log of the BLOB writes but
did not see the log of the record update, the index tree would be
corrupted. The correct solution is to make the freed clustered index
pages unavailable to the BLOB allocation. This function is also a
likely culprit of InnoDB hangs that were observed when testing the
Bug#12612184 fix.

btr_mark_freed_leaves(): Mark all freed clustered index leaf pages of
a mini-transaction allocated (nonfree=TRUE) before storing the BLOBs,
or freed (nonfree=FALSE) before committing the mini-transaction.

btr_freed_leaves_validate(): A debug function for checking that all
clustered index leaf pages that have been marked free in the
mini-transaction are consistent (have not been zeroed out).

btr_page_alloc_low(): Refactored from btr_page_alloc(). Return the
number of the allocated page, or FIL_NULL if out of space. Add the
parameter "mtr_t* init_mtr" for specifying the mini-transaction where
the page should be initialized, or if this is a "fake allocation"
(init_mtr=NULL) by btr_mark_freed_leaves(nonfree=TRUE).

btr_page_alloc(): Add the parameter init_mtr, allowing the page to be
initialized and X-latched in a different mini-transaction than the one
that is used for the allocation. Invoke btr_page_alloc_low(). If a
clustered index leaf page was previously freed in mtr, remove it from
the memo of previously freed pages.

btr_page_free(): Assert that the page is a B-tree page and it has been
X-latched by the mini-transaction. If the freed page was a leaf page
of a clustered index, link it by a MTR_MEMO_FREE_CLUST_LEAF marker to
the mini-transaction.

btr_store_big_rec_extern_fields_func(): Add the parameter alloc_mtr,
which is NULL (old behaviour in inserts) and the same as local_mtr in
updates. If alloc_mtr!=NULL, the BLOB pages will be allocated from it
instead of the mini-transaction that is used for writing the BLOBs.

fsp_alloc_from_free_frag(): Refactored from
fsp_alloc_free_page(). Allocate the specified page from a partially
free extent.

fseg_alloc_free_page_low(), fseg_alloc_free_page_general(): Add the
parameter "mtr_t* init_mtr" for specifying the mini-transaction where
the page should be initialized, or NULL if this is a "fake allocation"
that prevents the reuse of a previously freed B-tree page for BLOB
storage. If init_mtr==NULL, try harder to reallocate the specified page
and assert that it succeeded.

fsp_alloc_free_page(): Add the parameter "mtr_t* init_mtr" for
specifying the mini-transaction where the page should be initialized.
Do not allow init_mtr == NULL, because this function is never to be
used for "fake allocations".

mtr_t: Add the operation MTR_MEMO_FREE_CLUST_LEAF and the flag
mtr->freed_clust_leaf for quickly determining if any
MTR_MEMO_FREE_CLUST_LEAF operations have been posted.

row_ins_index_entry_low(): When columns are being made off-page in
insert-by-update, invoke btr_mark_freed_leaves(nonfree=TRUE) and pass
the mini-transaction as the alloc_mtr to
btr_store_big_rec_extern_fields(). Finally, invoke
btr_mark_freed_leaves(nonfree=FALSE) to avoid leaking pages.

row_build(): Correct a comment, and add a debug assertion that a
record that contains NULL BLOB pointers must be a fresh insert.

row_upd_clust_rec(): When columns are being moved off-page, invoke
btr_mark_freed_leaves(nonfree=TRUE) and pass the mini-transaction as
the alloc_mtr to btr_store_big_rec_extern_fields(). Finally, invoke
btr_mark_freed_leaves(nonfree=FALSE) to avoid leaking pages.

buf_reset_check_index_page_at_flush(): Remove. The function
fsp_init_file_page_low() already sets
bpage->check_index_page_at_flush=FALSE.

There is a known issue in tablespace extension. If the request to
allocate a BLOB page leads to the tablespace being extended, crash
recovery could see BLOB writes to pages that are off the tablespace
file bounds. This should trigger an assertion failure in fil_io() at
crash recovery. The safe thing would be to write redo log about the
tablespace extension to the mini-transaction of the BLOB write, not to
the mini-transaction of the record update. However, there is no redo
log record for file extension in the current redo log format.

rb:693 approved by Sunny Bains
2011-08-29 11:16:42 +03:00
Marko Mäkelä
c8163a16fa Merge mysql-5.1-security to mysql-5.5-security. 2011-08-10 15:03:33 +03:00
Marko Mäkelä
f2f4b19678 Bug#12626794 61240: UNUSED FUNCTIONS ... 2011-08-10 14:56:14 +03:00
Marko Mäkelä
53e9aabe12 Bug#12606344 - ADD VALGRIND DIAGNOSTICS TO MTR_START, MTR_COMMIT
mtr_start(): Declare the mtr memory area uninitialized in Valgrind
before initializing the fields.

mtr_commit(): Declare everything uninitialized except
mtr->start_lsn, mtr->end_lsn and mtr->state.
2011-05-31 10:55:29 +03:00
Marko Mäkelä
85dbe88d2f Bug#11766305 - 59392: Remove thr0loc.c and ibuf_inside() [part 4 of 4]
ibuf_inside(), ibuf_enter(), ibuf_exit(): Add the parameter mtr. The
flag is no longer kept in the thread-local storage but in the
mini-transaction (mtr->inside_ibuf).

mtr_start(): Clean up the comment and remove the unused return value.
mtr_commit(): Assert !ibuf_inside(mtr) in debug builds.

ibuf_mtr_start(): Like mtr_start(), but sets the flag.
ibuf_mtr_commit(), ibuf_btr_pcur_commit_specify_mtr(): Wrappers that
assert ibuf_inside().

buf_page_get_zip(), buf_page_init_for_read(),
buf_read_ibuf_merge_pages(), fil_io(), ibuf_free_excess_pages(),
ibuf_contract_ext(): Remove assertions on ibuf_inside(), because a
mini-transaction is not available.

buf_read_ahead_linear(): Add the parameter inside_ibuf.

ibuf_restore_pos(): When this function returns FALSE, it commits mtr
and must therefore do ibuf_exit(mtr).

ibuf_delete_rec(): This function commits mtr and must therefore do
ibuf_exit(mtr).

ibuf_rec_get_page_no(), ibuf_rec_get_space(), ibuf_rec_get_info(),
ibuf_rec_get_op_type(), ibuf_build_entry_from_ibuf_rec(),
ibuf_rec_get_volume(), ibuf_get_merge_page_nos(),
ibuf_get_volume_buffered_count(), ibuf_get_entry_counter_low(): Add
the parameter mtr in debug builds, for asserting ibuf_inside(mtr).

rb:585 approved by Sunny Bains
2011-03-24 14:00:14 +02:00
Marko Mäkelä
12a54ac5d7 Merge mysql-5.1-innodb to mysql-5.5-innodb. 2011-01-25 12:35:35 +02:00
Sunny Bains
424250c26c Remove code that was added during the flush list mutex refactoring. We cannot
release a dirty page in the middle of a mini-transaction. Replace the code
with an assertion that checks for this condition.

Original svn revision was: r6330.
2010-07-22 14:46:12 +10:00
Marko Mäkelä
c1567ecebd Bug#54728: Replace the dulint struct with a 64-bit integer. 2010-06-23 14:06:59 +03:00
Vasil Dimov
c7525a0130 Merge from innodb-branches-innodb+ (2) 2010-04-19 20:53:16 +03:00
Vasil Dimov
c877ff39bc Import branches/innodb+ from SVN on top of storage/innobase. 2010-04-12 18:20:41 +03:00
Satya B
d63eb541f5 Merging Innodb plugin 1.0.5 revisions from 5.1-main from revisions 3149 to 3163
also merged missing Innodb plugin revisions r5636,r5635 manually
2009-10-16 17:28:02 +05:30