Commit graph

21123 commits

Author SHA1 Message Date
Sergei Golubchik
1ebec10375 InnoDB: fix compilation with -DDBUG_OFF 2018-09-21 13:31:37 +02:00
Marko Mäkelä
81ba90b59e Merge 10.3 into 10.4 2018-09-21 13:16:38 +03:00
Marko Mäkelä
bc7d40d032 Clean up some SPATIAL INDEX code
Clarify some comments about accessing an externally stored column
on which a spatial index has been defined. Add a TODO comment that
we should actually write the minimum bounding rectangle (MBR) to
the undo log record, so that we can avoid fetching BLOBs and recomputing
MBR.

row_build_spatial_index_key(): Split from row_build_index_entry_low().
2018-09-21 09:27:06 +03:00
Marko Mäkelä
45eaed0c4c Merge 10.3 into 10.4 2018-09-19 17:38:00 +03:00
Marko Mäkelä
90b292ce31 Follow-up to MDEV-16328: ALTER TABLE…page_compression_level should not rebuild table
Allow combination of non-instant, non-rebuilding operations with
changes of table options that do not require a rebuild.

For example, DROP INDEX or ADD INDEX can be performed with
ALGORITHM=NOCOPY together with changing such table options.
Changing the table options alone would be allowed with ALGORITHM=INSTANT.

INNOBASE_ALTER_NOCREATE: A new set of flags, for operations that
are refused for ALGORITHM=INSTANT and do not involve creating
index trees.

Move ALTER_RENAME_INDEX to the proper place (INNOBASE_ALTER_INSTANT).

innobase_need_rebuild(): Do not require a rebuild if
INNOBASE_ALTER_NOREBUILD operations are combined with ALTER_OPTIONS.

ha_innobase::prepare_inplace_alter_table(),
ha_innobase::inplace_alter_table(): Use the fast path if
ALTER_OPTIONS is combined with INNOBASE_ALTER_NOCREATE.
In this case, the actual changes would be deferred to
ha_innobase::commit_inplace_alter_table().
2018-09-19 17:29:25 +03:00
Marko Mäkelä
28ae79650d Terminology: 'metadata' not 'default rec'
This follows up to commit 755187c853.

TRX_UNDO_INSERT_METADATA: Renamed from TRX_UNDO_INSERT_DEFAULT

trx_undo_metadata: Renamed from trx_undo_default_rec
2018-09-19 09:12:58 +03:00
Marko Mäkelä
4a246fecbd Merge 10.3 into 10.4 2018-09-19 08:21:03 +03:00
Marko Mäkelä
755187c853 Terminology: 'metadata record' instead of 'default row'
For instant ALTER TABLE, we store a hidden metadata record at the
start of the clustered index, to indicate how the format of the
records differs from the latest table definition.

The term 'default row' is too specific, because it applies to
instant ADD COLUMN only, and we will be supporting more classes
of instant ALTER TABLE later on. For instant ADD COLUMN, we
store the initial default values in the metadata record.
2018-09-19 07:21:24 +03:00
Marko Mäkelä
043639f9b0 Simplify innobase_add_instant_try()
Remove some code duplication and dead code. If no 'default row'
record exists, the root page must be in the conventional format.
Should the page type already be FIL_PAGE_TYPE_INSTANT, we would
necessarily hit a debug assertion failure in page_set_instant().
2018-09-18 21:25:24 +03:00
Jacob Mathew
c9a5804e14 MDEV-17144: Sample of spider_direct_sql cause crash
The crash occurs when the Spider node server attempts to create an error
message stating that the temporary table is not found.  The function to
create the error message is called with incorrect parameters.

I fixed the crash by correcting the incorrect parameter values.

Author:
  Jacob Mathew.

Reviewer:
  Kentoku Shiba.

Cherry-Picked:
  Commit e339616 on branch bb-10.3-mdev-17144
2018-09-18 11:05:53 -07:00
Jacob Mathew
e33961611a MDEV-17144: Sample of spider_direct_sql cause crash
The crash occurs when the Spider node server attempts to create an error
message stating that the temporary table is not found.  The function to
create the error message is called with incorrect parameters.

I fixed the crash by correcting the incorrect parameter values.

Author:
  Jacob Mathew.

Reviewer:
  Kentoku Shiba.
2018-09-17 18:48:38 -07:00
Marko Mäkelä
21f310db30 Fix the Windows build 2018-09-17 18:10:37 +03:00
Marko Mäkelä
774a4cb547 Mroonga follow-up fix for MDEV-16328
Now that ha_innobase::prepare_inplace_alter_table() is accessing
ha_alter_info->create_info->option_struct, we must initialize it in
the Mroonga wrapper for ALTER TABLE based on the parsed table options
for the wrap_altered_table.
2018-09-17 17:50:56 +03:00
Marko Mäkelä
ac24289e66 MDEV-16328 ALTER TABLE…page_compression_level should not rebuild table
The table option page_compression_level is something that only
affects future writes, not actually the data format. Therefore,
we can allow instant changes of this option.

Similarly, the table option page_compressed can be set on a
previously uncompressed table without rebuilding the table,
because an uncompressed page would be considered valid when
reading a page_compressed table.

Removing the page_compressed option will continue to require
the table to be rebuilt.

ha_innobase_inplace_ctx::page_compression_level: The requested
page_compression_level at the start of ALTER TABLE, or 0 if
page_compressed=OFF.

alter_options_need_rebuild(): Renamed from
create_option_need_rebuild(). Allow page_compression_level and
page_compressed to be changed as above, without rebuilding the table.

ha_innobase::check_if_supported_inplace_alter(): Allow ALGORITHM=INSTANT
for ALTER_OPTIONS if the table is not to be rebuilt. If rebuild is
needed, set ha_alter_info->unsupported_reason.

innobase_page_compression_try(): Update SYS_TABLES.TYPE according
to the table flags, for an instant change of page_compression_level
or page_compressed.

commit_cache_norebuild(): Adjust dict_table_t::flags, fil_space_t::flags
and (if needed) FSP_SPACE_FLAGS if page_compression_level was specified.
2018-09-17 14:37:39 +03:00
Marko Mäkelä
4fa9eaf454 Remove unused ha_innobase::lock
In MySQL 5.7, a follow-up to WL#6671 removed the unused
fields ha_innobase::lock and INNOBASE_SHARE::lock, but
MariaDB did not remove them, even though a counterpart of
WL#6671 itself was implemented as MDEV-7660 in
commit d665e79c5b.

INNOBASE_SHARE was removed in MDEV-16557. Thus, all that
needs to be removed is the unused member ha_innobase::lock
and related code.

Thanks to Monty (and Valgrind) for noticing that
ha_innobase::lock was uninitialized.
2018-09-17 08:55:56 +03:00
Marko Mäkelä
171fbbb968 Merge 10.3 into 10.4 2018-09-14 15:23:34 +03:00
Marko Mäkelä
aba5c72be2 MDEV-17196 Crash during instant ADD COLUMN with long DEFAULT value
A debug assertion would fail if an instant ADD COLUMN operation
involves splitting the leftmost leaf page and storing a default
value off-page. Another debug assertion could fail if the
default value does not fit in an undo log page.

btr_cur_pessimistic_update(): Invoke rec_offs_make_valid()
in order to prevent rec_offs_validate() assertion failure.

innobase_add_instant_try(): Invoke btr_cur_pessimistic_update()
with the BTR_KEEP_POS_FLAG, which is the correct course of action
when BLOBs may need to be written. Whenever returning true,
ensure that my_error() will have been called.
2018-09-14 15:06:58 +03:00
Oleksandr Byelkin
28f08d3753 Merge branch '10.1' into 10.2 2018-09-14 08:47:22 +02:00
Jacob Mathew
30d225698f MDEV-16912: Spider Order By column[datatime] limit 5 returns 3 rows
The problem occurs in 10.2 and earlier releases of MariaDB Server because the
Partition Engine was not pushing the engine conditions to the underlying
storage engine of each partition.  This caused Spider to return the first 5
rows in the table with the data provided by the customer.  2 of the 5 rows
did not qualify the WHERE clause, so they were removed from the result set by
the server.

To fix the problem, I have back-ported support for engine condition pushdown
in the Partition Engine from MariaDB Server 10.3 to 10.2 and 10.1.  In 10.3
and 10.4 I have merged the comments and the test case.

Author:
  Jacob Mathew.

Reviewer:
  Kentoku Shiba.

Cherry-Picked:
  Commit ed49f9a on branch 10.3
2018-09-13 16:07:02 -07:00
Jacob Mathew
ed49f9aae2 MDEV-16912: Spider Order By column[datatime] limit 5 returns 3 rows
The problem occurs in 10.2 and earlier releases of MariaDB Server because the
Partition Engine was not pushing the engine conditions to the underlying
storage engine of each partition.  This caused Spider to return the first 5
rows in the table with the data provided by the customer.  2 of the 5 rows
did not qualify the WHERE clause, so they were removed from the result set by
the server.

To fix the problem, I have back-ported support for engine condition pushdown
in the Partition Engine from MariaDB Server 10.3 to 10.2 and 10.1.  In 10.3
and 10.4 I have merged the comments and the test case.

Author:
  Jacob Mathew.

Reviewer:
  Kentoku Shiba.

Merged:
  Commit eb2ca3d on branch bb-10.2-MDEV-16912
2018-09-13 14:55:46 -07:00
Jacob Mathew
6c47c1c456 MDEV-16912: Spider Order By column[datatime] limit 5 returns 3 rows
The problem occurs in 10.2 and earlier releases of MariaDB Server because the
Partition Engine was not pushing the engine conditions to the underlying
storage engine of each partition.  This caused Spider to return the first 5
rows in the table with the data provided by the customer.  2 of the 5 rows
did not qualify the WHERE clause, so they were removed from the result set by
the server.

To fix the problem, I have back-ported support for engine condition pushdown
in the Partition Engine from MariaDB Server 10.3.

Author:
  Jacob Mathew.

Reviewer:
  Kentoku Shiba.

Cherry-Picked:
  Commit eb2ca3d on branch bb-10.2-MDEV-16912
2018-09-13 13:27:03 -07:00
Jacob Mathew
3866589308 MDEV-16912: Spider Order By column[datatime] limit 5 returns 3 rows
The problem occurs in 10.2 and earlier releases of MariaDB Server because the
Partition Engine was not pushing the engine conditions to the underlying
storage engine of each partition.  This caused Spider to return the first 5
rows in the table with the data provided by the customer.  2 of the 5 rows
did not qualify the WHERE clause, so they were removed from the result set by
the server.

To fix the problem, I have back-ported support for engine condition pushdown
in the Partition Engine from MariaDB Server 10.3.

Author:
  Jacob Mathew.

Reviewer:
  Kentoku Shiba.

Merged:
  Commit eb2ca3d on branch bb-10.2-MDEV-16912
2018-09-13 11:41:55 -07:00
Sergei Petrunia
d85a7220dc MDEV-17188: rocksdb.2pc_group_commit fails intermittently in BB
When counter increment is not within the expected range, print the number
instead of just FAIL.

This doesnt solve the bug but will help with the diagnostics.
2018-09-13 15:00:13 +03:00
Oleksandr Byelkin
2b46dca5d7 Merge remote-tracking branch 'connect/10.2' into 10.2 2018-09-13 10:09:28 +02:00
Ming Lin
87609324e0 MDEV-16768: fix blob key length
The blob key length could be shorter than the length of the entire blob,
for example,

CREATE TABLE t1 (b BLOB, i INT, KEY(b(8)));
INSERT INTO t1 VALUES (REPEAT('a',9),1);

The key length is 8, while the blob length is 9.
So we need to set the correct key length in Field_blob::sort_string().
2018-09-12 15:15:13 +03:00
Jacob Mathew
eb2ca3d445 MDEV-16912: Spider Order By column[datatime] limit 5 returns 3 rows
The problem occurs in 10.2 and earlier releases of MariaDB Server because the
Partition Engine was not pushing the engine conditions to the underlying
storage engine of each partition.  This caused Spider to return the first 5
rows in the table with the data provided by the customer.  2 of the 5 rows
did not qualify the WHERE clause, so they were removed from the result set by
the server.

To fix the problem, I have back-ported support for engine condition pushdown
in the Partition Engine from MariaDB Server 10.3.

Author:
  Jacob Mathew.

Reviewer:
  Kentoku Shiba.
2018-09-11 16:29:44 -07:00
Marko Mäkelä
5567a8c936 MDEV-17138 Reduce redo log volume for undo tablespace initialization
Implement a 10.4 redo log format, which extends the 10.3 format
by introducing the MLOG_MEMSET record.

MLOG_MEMSET: A new redo log record type for filling an area with a byte.

mlog_memset(): Write the MLOG_MEMSET record.

mlog_parse_nbytes(): Handle MLOG_MEMSET as well.

trx_rseg_header_create(): Reduce the redo log volume by making use of
mlog_memset() and the zero-initialization that happens inside page
allocation.

fil_addr_null: Remove.

flst_init(): Create a variant that takes a zero-initialized
buf_block_t* as a parameter, and only writes the FIL_NULL using
mlog_memset().

flst_zero_addr(): A variant of flst_write_addr() that writes
a null address using mlog_memset() for the FIL_NULL.

The following fixes are replacing some use of MLOG_WRITE_STRING
with the more compact MLOG_MEMSET record, or eliminating
redundant redo log writes:

btr_store_big_rec_extern_fields(): Invoke mlog_memset() for
zero-initializing the tail of the ROW_FORMAT=COMPRESSED BLOB page.

trx_sysf_create(), trx_rseg_format_upgrade(): Invoke mlog_memset()
for zero-initializing the page trailer.

fsp_header_init(), trx_rseg_header_create():
Remove redundant zero-initializations.
2018-09-11 21:32:36 +03:00
Marko Mäkelä
09af00cbde MDEV-13564: Remove old crash-upgrade logic in 10.4
Stop supporting the additional *trunc.log files that were
introduced via MySQL 5.7 to MariaDB Server 10.2 and 10.3.

DB_TABLESPACE_TRUNCATED: Remove.

purge_sys.truncate: A new structure to track undo tablespace
file truncation.

srv_start(): Remove the call to buf_pool_invalidate(). It is
no longer necessary, given that we no longer access things in
ways that violate the ARIES protocol. This call was originally
added for innodb_file_format, and it may later have been necessary
for the proper function of the MySQL 5.7 TRUNCATE recovery, which
we are now removing.

trx_purge_cleanse_purge_queue(): Take the undo tablespace as a
parameter.

trx_purge_truncate_history(): Rewrite everything mostly in a
single function, replacing references to undo::Truncate.

recv_apply_hashed_log_recs(): If any redo log is to be applied,
and if the log_sys.log.subformat indicates that separately
logged truncate may have been used, refuse to proceed except if
innodb_force_recovery is set. We will still refuse crash-upgrade
if TRUNCATE TABLE was logged. Undo tablespace truncation would
only be logged in undo*trunc.log files, which we are no longer
checking for.
2018-09-11 21:32:15 +03:00
Marko Mäkelä
67fa97dc2c Merge 10.3 into 10.4 2018-09-11 21:31:47 +03:00
Marko Mäkelä
1bf3e8ab43 Merge 10.3 into 10.4 2018-09-11 21:31:03 +03:00
Marko Mäkelä
6b61f1bbad MDEV-17161 TRUNCATE TABLE fails after upgrade from 10.1 2018-09-10 16:15:23 +03:00
Marko Mäkelä
fc34e4c067 MDEV-17161 TRUNCATE TABLE fails after upgrade from 10.1
With the TRUNCATE by rename, create, drop (MDEV-13564),
old tables with invalid ROW_FORMAT attribute could not be
truncated. Introduce a sloppy mode for allowing the TRUNCATE.

create_table_info_t::prepare_create_table(): Add the parameter
strict=true.

ha_innobase::create(): Pass strict=false if trx!=NULL
(the create is part of TRUNCATE).
2018-09-10 16:10:26 +03:00
Marko Mäkelä
b02c722e7a MDEV-17158 TRUNCATE is not atomic after MDEV-13564 2018-09-10 15:40:11 +03:00
Marko Mäkelä
75f8e86f57 MDEV-17158 TRUNCATE is not atomic after MDEV-13564
It turned out that ha_innobase::truncate() would prematurely
commit the transaction already before the completion of the
ha_innobase::create(). All of this must be atomic.

innodb.truncate_crash: Use the correct DEBUG_SYNC point, and
tolerate non-truncation of the table, because the redo log
for the TRUNCATE transaction commit might be flushed due to
some InnoDB background activity.

dict_build_tablespace_for_table(): Merge to the function
dict_build_table_def_step().

dict_build_table_def_step(): If a table is being created during
an already started data dictionary transaction (such as TRUNCATE),
persistently write the table_id to the undo log header before
creating any file. In this way, the recovery of TRUNCATE will be
able to delete the new file before rolling back the rename of
the original table.

dict_table_rename_in_cache(): Add the parameter replace_new_file,
used as part of rolling back a TRUNCATE operation.

fil_rename_tablespace_check(): Add the parameter replace_new.
If the parameter is set and a file identified by new_path exists,
remove a possible tablespace and also the file.

create_table_info_t::create_table_def(): Remove some debug assertions
that no longer hold. During TRUNCATE, the transaction will already
have been started (and performed a rename operation) before the
table is created. Also, remove a call to dict_build_tablespace_for_table().

create_table_info_t::create_table(): Add the parameter create_fk=true.
During TRUNCATE TABLE, do not add FOREIGN KEY constraints to the
InnoDB data dictionary, because they will also not be removed.

row_table_add_foreign_constraints(): If trx=NULL, do not modify
the InnoDB data dictionary, but only load the FOREIGN KEY constraints
from the data dictionary.

ha_innobase::create(): Lock the InnoDB data dictionary cache only
if no transaction was passed by the caller. Unlock it in any case.

innobase_rename_table(): Add the parameter commit = true.
If !commit, do not lock or unlock the data dictionary cache.

ha_innobase::truncate(): Lock the data dictionary before invoking
rename or create, and let ha_innobase::create() unlock it and
also commit or roll back the transaction.

trx_undo_mark_as_dict(): Renamed from trx_undo_mark_as_dict_operation()
and declared global instead of static.

row_undo_ins_parse_undo_rec(): If table_id is set, this must
be rolling back the rename operation in TRUNCATE TABLE, and
therefore replace_new_file=true.
2018-09-10 14:59:58 +03:00
Marko Mäkelä
5822d31696 Follow-up to MDEV-13407: Remove fil_wait_crypt_bg_threads() 2018-09-09 11:38:14 +03:00
Marko Mäkelä
99e36a7157 Follow-up to MDEV-13407: Remove fil_wait_crypt_bg_threads()
Tables whose reference count is not zero will be crash-safely
dropped in the background when the count reaches zero. Therefore,
it is no longer necessary to wait for all references to be released
before possibly adding the table to the background queue.
2018-09-09 11:35:02 +03:00
Sergey Vojtovich
d9613b750c Enable C++11 2018-09-09 10:05:56 +04:00
Marko Mäkelä
5a1868b58d MDEV-13564 Mariabackup does not work with TRUNCATE
This is a merge from 10.2, but the 10.2 version of this will not
be pushed into 10.2 yet, because the 10.2 version would include
backports of MDEV-14717 and MDEV-14585, which would introduce
a crash recovery regression: Tables could be lost on
table-rebuilding DDL operations, such as ALTER TABLE,
OPTIMIZE TABLE or this new backup-friendly TRUNCATE TABLE.
The test innodb.truncate_crash occasionally loses the table due to
the following bug:

MDEV-17158 log_write_up_to() sometimes fails
2018-09-07 22:15:06 +03:00
Marko Mäkelä
980d1bf1a9 MDEV-14717: Prevent crash-downgrade to earlier MariaDB 10.2
A crash-downgrade of a RENAME (or TRUNCATE or table-rebuilding
ALTER TABLE or OPTIMIZE TABLE) operation to an earlier 10.2 version
would trigger a debug assertion failure during rollback,
in trx_roll_pop_top_rec_of_trx(). In a non-debug build, the
TRX_UNDO_RENAME_TABLE record would be misinterpreted as an
update_undo log record, and typically the file name would be
interpreted as DB_TRX_ID,DB_ROLL_PTR,PRIMARY KEY. If a matching
record would be found, row_undo_mod() would hit ut_error in
switch (node->rec_type). Typically, ut_a(table2 == NULL) would
fail when opening the table from SQL.

Because of this, we prevent a crash-downgrade to earlier MariaDB 10.2
versions by changing the InnoDB redo log format identifier to the
10.3 identifier, and by introducing a subformat identifier so that
10.2 can continue to refuse crash-downgrade from 10.3 or later.
After a clean shutdown, a downgrade to MariaDB 10.2.13 or later would
still be possible thanks to MDEV-14909. A downgrade to older 10.2
versions is only possible after removing the log files (not recommended).

LOG_HEADER_FORMAT_CURRENT: Change to 103 (originally the 10.3 format).

log_group_t: Add subformat. For 10.2, we will use subformat 1,
and will refuse crash recovery from any other subformat of the
10.3 format, that is, a genuine 10.3 redo log.

recv_find_max_checkpoint(): Allow startup after clean shutdown
from a future LOG_HEADER_FORMAT_10_4 (unencrypted only).
We cannot handle the encrypted 10.4 redo log block format,
which was introduced in MDEV-12041. Allow crash recovery from
the original 10.2 format as well as the new format.
In Mariabackup --backup, do not allow any startup from 10.3 or 10.4
redo logs.

recv_recovery_from_checkpoint_start(): Skip redo log apply for
clean 10.3 redo log, but not for the new 10.2 redo log
(10.3 format, subformat 1).

srv_prepare_to_delete_redo_log_files(): On format or subformat
mismatch, set srv_log_file_size = 0, so that we will display the
correct message.

innobase_start_or_create_for_mysql(): Check for format or subformat
mismatch.

xtrabackup_backup_func(): Remove debug assertions that were made
redundant by the code changes in recv_find_max_checkpoint().
2018-09-07 22:10:03 +03:00
Marko Mäkelä
73ed19e44f MDEV-14585 Automatically remove #sql- tables in InnoDB dictionary during recovery
This is a backport of the following commits:
commit b4165985c9
commit 69e88de0fe
commit 40f4525f43
commit 656f66def2

Now that MDEV-14717 made RENAME TABLE crash-safe within InnoDB,
it should be safe to drop the #sql- tables within InnoDB during
crash recovery. These tables can be one of two things:

(1) #sql-ib related to deferred DROP TABLE (follow-up to MDEV-13407)
or to table-rebuilding ALTER TABLE...ALGORITHM=INPLACE
(since MDEV-14378, only related to the intermediate copy of a table),

(2) #sql- related to the intermediate copy of a table during
ALTER TABLE...ALGORITHM=COPY

We will not drop tables whose name starts with #sql2, because
the server can be killed during an ALGORITHM=COPY operation at
a point where the original table was renamed to #sql2 but the
finished intermediate copy was not yet renamed from #sql-
to the original table name.

If an old version of MariaDB Server before 10.2.13 (MDEV-11415)
was killed while ALTER TABLE...ALGORITHM=COPY was in progress,
after recovery there could be undo log records for some records that were
inserted into an intermediate copy of the table. Due to these undo log
records, InnoDB would resurrect locks at recovery, and the intermediate
table would be locked while we are trying to drop it. This would cause
a call to row_rename_table_for_mysql(), either from
row_mysql_drop_garbage_tables() or from the rollback of a RENAME
operation that was part of the ALTER TABLE.

row_rename_table_for_mysql(): Do not attempt to parse FOREIGN KEY
constraints when renaming from #sql-something to #sql-something-else,
because it does not make any sense.

row_drop_table_for_mysql(): When deferring DROP TABLE due to locks,
do not rename the table if its name already starts with the #sql-
prefix, which is what row_mysql_drop_garbage_tables() uses.
Previously, the too strict prefix #sql-ib was used, and some
tables were renamed unnecessarily.
2018-09-07 22:10:03 +03:00
Marko Mäkelä
8dcacd3b01 Follow-up to MDEV-13407 innodb.drop_table_background failed in buildbot with "Tablespace for table exists"
This is a backport of commit 88aff5f471.

The InnoDB background DROP TABLE queue is something that we should
really remove, but are unable to until we remove dict_operation_lock
so that DDL and DML operations can be combined in a single transaction.

Because the queue is not persistent, it is not crash-safe. We should
in some way ensure that the deferred-dropped tables will be dropped
after server restart.

The existence of two separate transactions complicates the error handling
of CREATE TABLE...SELECT. We should really not break locks in DROP TABLE.

Our solution to these problems is to rename the table to a temporary
name, and to drop such-named tables on InnoDB startup. Also, the
queue will use table IDs instead of names from now on.

check-testcase.test: Ignore #sql-ib*.ibd files, because tables may enter
the background DROP TABLE queue shortly before the test finishes.

innodb.drop_table_background: Test CREATE...SELECT and the creation of
tables whose file name starts with #sql-ib.

innodb.alter_crash: Adjust the recovery, now that the #sql-ib tables
will be dropped on InnoDB startup.

row_mysql_drop_garbage_tables(): New function, to drop all #sql-ib tables
on InnoDB startup.

row_drop_table_for_mysql_in_background(): Remove an unnecessary and
misplaced call to log_buffer_flush_to_disk(). (The call should have been
after the transaction commit. We do not care about flushing the redo log
here, because the table would be dropped again at server startup.)

Remove the entry from the list after the table no longer exists.

If server shutdown has been initiated, empty the list without actually
dropping any tables. They will be dropped again on startup.

row_drop_table_for_mysql(): Do not call lock_remove_all_on_table().
Instead, if locks exist, defer the DROP TABLE until they do not exist.
If the table name does not start with #sql-ib, rename it to that prefix
before adding it to the background DROP TABLE queue.
2018-09-07 22:10:03 +03:00
Marko Mäkelä
754727bb8a MDEV-14378 In ALGORITHM=INPLACE, use a common name for the intermediate tables or partitions
This is a backport of commit 07e9ff1fe1.

Allow DROP TABLE `#mysql50##sql-...._.` to drop tables that were
being rebuilt by ALGORITHM=INPLACE

NOTE: If the server is killed after the table-rebuilding ALGORITHM=INPLACE
commits inside InnoDB but before the .frm file has been replaced, then
the recovery will involve something else than DROP TABLE.

NOTE: If the server is killed in a true inplace ALTER TABLE commits
inside InnoDB but before the .frm file has been replaced, then we
are really out of luck. To properly handle that situation, we would
need a transactional mysql.ddl_fixup table that directs recovery to
rename or remove files.

prepare_inplace_alter_table_dict(): Use the altered_table->s->table_name
for generating the new_table_name.

table_name_t::part_suffix: The start of the partition name suffix.

table_name_t::dbend(): Return the end of the schema name.

table_name_t::dblen(): Return the length of the schema name, in bytes.

table_name_t::basename(): Return the name without the schema name.

table_name_t::part(): Return the partition name, or NULL if none.

row_drop_table_for_mysql(): Assert for #sql, not #sql-ib.
2018-09-07 22:10:03 +03:00
Marko Mäkelä
cf2a4426a2 MDEV-14717 RENAME TABLE in InnoDB is not crash-safe
This is a backport of commit 0bc36758ba
and commit 9eb3fcc9fb.

InnoDB in MariaDB 10.2 appears to only write MLOG_FILE_RENAME2
redo log records during table-rebuilding ALGORITHM=INPLACE operations.
We must write the records for any .ibd file renames, so that the
operations are crash-safe.

If InnoDB is killed during a RENAME TABLE operation, it can happen that
the transaction for updating the data dictionary will be rolled back.
But, nothing will roll back the renaming of the .ibd file
(the MLOG_FILE_RENAME2 only guarantees roll-forward), or for that matter,
the renaming of the dict_table_t::name in the dict_sys cache. We introduce
the undo log record TRX_UNDO_RENAME_TABLE to fix this.

fil_space_for_table_exists_in_mem(): Remove the parameters
adjust_space, table_id and some code that was trying to work around
these deficiencies.

fil_name_write_rename(): Write a MLOG_FILE_RENAME2 record.

dict_table_rename_in_cache(): Invoke fil_name_write_rename().

trx_undo_rec_copy(): Set the first 2 bytes to the length of the
copied undo log record.

trx_undo_page_report_rename(), trx_undo_report_rename():
Write a TRX_UNDO_RENAME_TABLE record with the old table name.

row_rename_table_for_mysql(): Invoke trx_undo_report_rename()
before modifying any data dictionary tables.

row_undo_ins_parse_undo_rec(): Roll back TRX_UNDO_RENAME_TABLE
by invoking dict_table_rename_in_cache(), which will take care
of both renaming the table and the file.

ha_innobase::truncate(): Remove a work-around.
2018-09-07 22:10:02 +03:00
Marko Mäkelä
e67b1070bb MDEV-17049 Enable innodb_undo tests on buildbot
Remove the innodb_undo suite, and move and adapt the tests.
Remove unnecessary restarts, and add innodb_page_size_small.inc
for combinations.

innodb.undo_truncate is the merge of innodb_undo.truncate
and innodb_undo.truncate_multi_client.

Add the global status variable innodb_undo_truncations.
Without this, the test innodb.undo_truncate would occasionally
report that truncation did not happen. The test was only waiting
for the history list length to reach 0, but the undo tablespace
truncation would only take place some time after that.

Undo tablespace truncation will only occasionally occur with
innodb_page_size=32k, and typically never occur (with this amount
of undo log operations) with innodb_page_size=64k. We disable
these combinations.

innodb.undo_truncate_recover was formerly called
innodb_undo.truncate_recover.
2018-09-07 22:10:02 +03:00
Marko Mäkelä
055a3334ad MDEV-13564 Mariabackup does not work with TRUNCATE
Implement undo tablespace truncation via normal redo logging.

Implement TRUNCATE TABLE as a combination of RENAME to #sql-ib name,
CREATE, and DROP.

Note: Orphan #sql-ib*.ibd may be left behind if MariaDB Server 10.2
is killed before the DROP operation is committed. If MariaDB Server 10.2
is killed during TRUNCATE, it is also possible that the old table
was renamed to #sql-ib*.ibd but the data dictionary will refer to the
table using the original name.

In MariaDB Server 10.3, RENAME inside InnoDB is transactional,
and #sql-* tables will be dropped on startup. So, this new TRUNCATE
will be fully crash-safe in 10.3.

ha_mroonga::wrapper_truncate(): Pass table options to the underlying
storage engine, now that ha_innobase::truncate() will need them.

rpl_slave_state::truncate_state_table(): Before truncating
mysql.gtid_slave_pos, evict any cached table handles from
the table definition cache, so that there will be no stale
references to the old table after truncating.

== TRUNCATE TABLE ==

WL#6501 in MySQL 5.7 introduced separate log files for implementing
atomic and crash-safe TRUNCATE TABLE, instead of using the InnoDB
undo and redo log. Some convoluted logic was added to the InnoDB
crash recovery, and some extra synchronization (including a redo log
checkpoint) was introduced to make this work. This synchronization
has caused performance problems and race conditions, and the extra
log files cannot be copied or applied by external backup programs.

In order to support crash-upgrade from MariaDB 10.2, we will keep
the logic for parsing and applying the extra log files, but we will
no longer generate those files in TRUNCATE TABLE.

A prerequisite for crash-safe TRUNCATE is a crash-safe RENAME TABLE
(with full redo and undo logging and proper rollback). This will
be implemented in MDEV-14717.

ha_innobase::truncate(): Invoke RENAME, create(), delete_table().
Because RENAME cannot be fully rolled back before MariaDB 10.3
due to missing undo logging, add some explicit rename-back in
case the operation fails.

ha_innobase::delete(): Introduce a variant that takes sqlcom as
a parameter. In TRUNCATE TABLE, we do not want to touch any
FOREIGN KEY constraints.

ha_innobase::create(): Add the parameters file_per_table, trx.
In TRUNCATE, the new table must be created in the same transaction
that renames the old table.

create_table_info_t::create_table_info_t(): Add the parameters
file_per_table, trx.

row_drop_table_for_mysql(): Replace a bool parameter with sqlcom.

row_drop_table_after_create_fail(): New function, wrapping
row_drop_table_for_mysql().

dict_truncate_index_tree_in_mem(), fil_truncate_tablespace(),
fil_prepare_for_truncate(), fil_reinit_space_header_for_table(),
row_truncate_table_for_mysql(), TruncateLogger,
row_truncate_prepare(), row_truncate_rollback(),
row_truncate_complete(), row_truncate_fts(),
row_truncate_update_system_tables(),
row_truncate_foreign_key_checks(), row_truncate_sanity_checks():
Remove.

row_upd_check_references_constraints(): Remove a check for
TRUNCATE, now that the table is no longer truncated in place.

The new test innodb.truncate_foreign uses DEBUG_SYNC to cover some
race-condition like scenarios. The test innodb-innodb.truncate does
not use any synchronization.

We add a redo log subformat to indicate backup-friendly format.
MariaDB 10.4 will remove support for the old TRUNCATE logging,
so crash-upgrade from old 10.2 or 10.3 to 10.4 will involve
limitations.

== Undo tablespace truncation ==

MySQL 5.7 implements undo tablespace truncation. It is only
possible when innodb_undo_tablespaces is set to at least 2.
The logging is implemented similar to the WL#6501 TRUNCATE,
that is, using separate log files and a redo log checkpoint.

We can simply implement undo tablespace truncation within
a single mini-transaction that reinitializes the undo log
tablespace file. Unfortunately, due to the redo log format
of some operations, currently, the total redo log written by
undo tablespace truncation will be more than the combined size
of the truncated undo tablespace. It should be acceptable
to have a little more than 1 megabyte of log in a single
mini-transaction. This will be fixed in MDEV-17138 in
MariaDB Server 10.4.

recv_sys_t: Add truncated_undo_spaces[] to remember for which undo
tablespaces a MLOG_FILE_CREATE2 record was seen.

namespace undo: Remove some unnecessary declarations.

fil_space_t::is_being_truncated: Document that this flag now
only applies to undo tablespaces. Remove some references.

fil_space_t::is_stopping(): Do not refer to is_being_truncated.
This check is for tablespaces of tables. Potentially used
tablespaces are never truncated any more.

buf_dblwr_process(): Suppress the out-of-bounds warning
for undo tablespaces.

fil_truncate_log(): Write a MLOG_FILE_CREATE2 with a nonzero
page number (new size of the tablespace in pages) to inform
crash recovery that the undo tablespace size has been reduced.

fil_op_write_log(): Relax assertions, so that MLOG_FILE_CREATE2
can be written for undo tablespaces (without .ibd file suffix)
for a nonzero page number.

os_file_truncate(): Add the parameter allow_shrink=false
so that undo tablespaces can actually be shrunk using this function.

fil_name_parse(): For undo tablespace truncation,
buffer MLOG_FILE_CREATE2 in truncated_undo_spaces[].

recv_read_in_area(): Avoid reading pages for which no redo log
records remain buffered, after recv_addr_trim() removed them.

trx_rseg_header_create(): Add a FIXME comment that we could write
much less redo log.

trx_undo_truncate_tablespace(): Reinitialize the undo tablespace
in a single mini-transaction, which will be flushed to the redo log
before the file size is trimmed.

recv_addr_trim(): Discard any redo logs for pages that were
logged after the new end of a file, before the truncation LSN.
If the rec_list becomes empty, reduce n_addrs. After removing
any affected records, actually truncate the file.

recv_apply_hashed_log_recs(): Invoke recv_addr_trim() right before
applying any log records. The undo tablespace files must be open
at this point.

buf_flush_or_remove_pages(), buf_flush_dirty_pages(),
buf_LRU_flush_or_remove_pages(): Add a parameter for specifying
the number of the first page to flush or remove (default 0).

trx_purge_initiate_truncate(): Remove the log checkpoints, the
extra logging, and some unnecessary crash points. Merge the code
from trx_undo_truncate_tablespace(). First, flush all to-be-discarded
pages (beyond the new end of the file), then trim the space->size
to make the page allocation deterministic. At the only remaining
crash injection point, flush the redo log, so that the recovery
can be tested.
2018-09-07 22:10:02 +03:00
Marko Mäkelä
4901f31c13 Merge 10.2 into 10.3 2018-09-07 22:09:28 +03:00
Marko Mäkelä
59950df533 Remove some debug-only global status variables 2018-09-07 22:06:20 +03:00
Marko Mäkelä
0927332961 Make some declarations private
recv_addr_state, recv_addr_t: Define in log0recv.cc only.
2018-09-07 22:06:20 +03:00
Marko Mäkelä
93ed717b3d Relax debug assertions for undo tablespace recovery 2018-09-07 22:06:20 +03:00
Marko Mäkelä
9f6a0d291f row_purge_parse_undo_rec(): Deduplicate some code 2018-09-07 22:06:20 +03:00