Commit graph

5770 commits

Author SHA1 Message Date
Marko Mäkelä
50999738ea Merge 10.1 into 10.2 2019-05-13 18:48:28 +03:00
Marko Mäkelä
b93ecea65c Remove unnecessary pointer indirection for rw_lock_t
In MySQL 5.7.8 an extra level of pointer indirection was added to
dict_operation_lock and some other rw_lock_t without solid justification,
in mysql/mysql-server@52720f1772.

Let us revert that change and remove the rather useless rw_lock_t
constructor and destructor and the magic_n field. In this way,
some unnecessary pointer dereferences and heap allocation will be avoided
and debugging might be a little easier.
2019-05-13 18:46:12 +03:00
Marko Mäkelä
26a14ee130 Merge 10.1 into 10.2 2019-05-13 17:54:04 +03:00
Marko Mäkelä
2647fd101d MDEV-19445 heap-use-after-free related to innodb_ft_aux_table
Try to fix the race conditions between
SET GLOBAL innodb_ft_aux_table = ...;
and access to the INFORMATION_SCHEMA tables that depend on
this variable.

innodb_ft_aux_table: Replaces
fts_internal_tbl_name,fts_internal_tbl_name2. Just store the
user-specified parameter as is.

innodb_ft_aux_table_id: The table_id corresponding to
SET GLOBAL innodb_ft_aux_table, or 0 if the table does not exist
or does not contain FULLTEXT INDEX. If the table is renamed later,
the INFORMATION_SCHEMA tables will continue to refer to the table.
If the table is dropped or rebuilt, the INFORMATION_SCHEMA tables
will not find the table.
2019-05-13 17:16:42 +03:00
Marko Mäkelä
1c97e07f8f fts_optimize_words(): Remove stray output
With SET GLOBAL innodb_optimize_fulltext_only=1
in effect, OPTIMIZE TABLE would output words from the fulltext index
to the server error log, even in non-debug builds.

fts_optimize_words(): Remove the unwanted output.
2019-05-13 17:14:47 +03:00
Marko Mäkelä
c7c54ce606 fts_doc_ids_free(): Define inline 2019-05-13 11:32:20 +03:00
Marko Mäkelä
7f7211073c MDEV-19441 Typo in error message "InnoDB: FTS Doc ID must be large than"
row_insert_for_mysql(): Correct the grammar error, and
display the table name in both messages.
2019-05-13 08:54:43 +03:00
Vicențiu Ciorbaru
cb248f8806 Merge branch '5.5' into 10.1 2019-05-11 22:19:05 +03:00
Vicențiu Ciorbaru
5543b75550 Update FSF Address
* Update wrong zip-code
2019-05-11 21:29:06 +03:00
Vicențiu Ciorbaru
c0ac0b8860 Update FSF address 2019-05-11 19:25:02 +03:00
Vicențiu Ciorbaru
f177f125d4 Merge branch '5.5' into 10.1 2019-05-11 19:15:57 +03:00
Vicențiu Ciorbaru
15f1e03d46 Follow-up to changing FSF address
Some places didn't match the previous rules, making the Floor
address wrong.

Additional sed rules:

sed -i -e 's/Place.*Suite .*, Boston/Street, Fifth Floor, Boston/g'
sed -i -e 's/Suite .*, Boston/Fifth Floor, Boston/g'
2019-05-11 18:30:45 +03:00
Marko Mäkelä
8ce702aa90 MDEV-17540 Server crashes in row_purge after TRUNCATE TABLE
row_purge_upd_exist_or_extern_func(): Check for node->vcol_op_failed()
after row_purge_remove_sec_if_poss(), like row_purge_del_mark() did.
This avoids us dereferencing the node->table=NULL pointer.

The test case, submitted by Elena Stepanova, is not deterministic and
does not repeat the bug on 10.2. With the added loop, for me, it reliably
crashes 10.3 without the fix. I was unable to create a deterministic
test case for either 10.2 or 10.3.

Reviewed by Thirunarayanan Balathandayuthapani
2019-05-10 10:50:35 +03:00
Marko Mäkelä
b2f3755c8e Merge 10.1 into 10.2 2019-05-10 08:02:21 +03:00
Thirunarayanan Balathandayuthapani
3e8cab51cb MDEV-13893 encryption.innodb-redo-badkey failed in buildbot with page cannot be decrypted
buf_dblwr_process(): Remove the useless warning that a copy of a page
in the doublewrite buffer is corrupted. We already report an error if a
corrupted page cannot be recovered from the doublewrite buffer.

Note: In MariaDB 10.1, the original bug reported in MDEV-13893 could
still be easily repeatable. In MariaDB 10.2.24, MDEV-12699 should
have reduced the probability considerably.
2019-05-10 07:57:01 +03:00
Marko Mäkelä
542f32649b MDEV-18220: race condition in fts_get_table_name()
fts_get_table_name(): Add the parameter bool dict_locked=false.
2019-05-10 07:57:01 +03:00
Marko Mäkelä
f3718a112a MDEV-18220: Backport some code from MariaDB 10.2
fts_get_table_name(): Output to a caller-allocated buffer.

fts_get_table_name_prefix(): Use the lower-overhead allocation
ut_malloc() instead of mem_alloc().

This is based on mysql/mysql-server@d1584b9f38
in MySQL 5.7.4.
2019-05-10 07:57:01 +03:00
Marko Mäkelä
f92749ed36 MDEV-18220: heap-use-after-free in fts_get_table_name_prefix()
fts_table_t::parent: Remove the redundant field. Refer to
table->name.m_name instead.

fts_update_sync_doc_id(), fts_update_next_doc_id(): Remove
the redundant parameter table_name.

fts_get_table_name_prefix(): Access the dict_table_t::name.
FIXME: Ensure that this access is always covered by
dict_sys->mutex.
2019-05-10 07:57:01 +03:00
Marko Mäkelä
5b3f7c0c33 MDEV-18220: Remove some redundant data structures
fts_state_t, fts_slot_t::state: Remove. Replaced by fts_slot_t::running
and fts_slot_t::table_id as follows.

FTS_STATE_SUSPENDED: Removed (unused).

FTS_STATE_EMPTY: Removed. table_id=0 will denote empty slots.

FTS_STATE_RUNNING: Equivalent to running=true.

FTS_STATE_LOADED, FTS_STATE_DONE: Equivalent to running=false.

fts_slot_t::table: Remove. Tables will be identified by table_id.
After opening a table, we will check fil_table_accessible() before
accessing the data.

fts_optimize_new_table(), fts_optimize_del_table(),
fts_optimize_how_many(), fts_is_sync_needed():
Remove the parameter tables, and use the static variable fts_slots
(which was introduced in MariaDB 10.2) instead.
2019-05-10 07:56:55 +03:00
Eugene Kosov
06442e3e9f MDEV-19399 do not call slow my_timer_init() several times
No functional change.

Call my_timer_init() only once and then reuse it from InnoDB and
perfschema storage engines.

This patch speeds up empty test for me like this:
./mtr -mem innodb.kevg,xtradb  1.21s user 0.84s system 34% cpu 5.999 total
./mtr -mem innodb.kevg,xtradb  1.12s user 0.60s system 31% cpu 5.385 total
2019-05-10 07:56:55 +03:00
Sergey Vojtovich
d0b73fb8d3 MDEV-16060 - InnoDB: Failing assertion: ut_strcmp(index->name, key->name)
A sequel to 9180e86 and 149b754.

ALTER TABLE ... ADD FOREIGN KEY may crash if parent table is updated
concurrently.

Block FK parent table updates even earlier, before intermediate child
table is created.

Use proper charset info for my_casedn_str() and don't update original
identifiers so that lower_cast_table_names == 2 is honoured.
2019-05-09 11:13:44 +04:00
Marko Mäkelä
e0271a7b43 MDEV-19408 Assertion on trx->state failed in ReadView::copy_trx_ids
ReadView::copy_trx_ids(): Relax a debug check. It failed to account for
TRX_STATE_PREPARED_RECOVERED, which was introduced in MDEV-15772.
It was also reading trx->state twice and failed to tolerate
TRX_STATE_COMMITTED_IN_MEMORY, which could be concurrently assigned
in lock_trx_release_locks(), which is not holding trx_sys->mutex.

This bug is specific to the MariaDB 10.2 series. The ReadView was
introduced in MariaDB 10.2.2 by merging the code that had been
introduced in MySQL 5.7.2. In MariaDB 10.3, ReadView::snapshot()
would use the lock-free trx_sys.rw_trx_hash. MDEV-14638 moved the
corresponding assertion to trx_sys_t::find(), where it was duly
protected by trx->mutex, and later MDEV-14756 moved the check to
rw_trx_hash_t::validate_element(). This check was correctly adjusted
when MDEV-15772 was merged to 10.3.
2019-05-08 07:01:20 +03:00
qingda2019
1214674b71 MDEV-13942 InnoDB SPATIAL INDEX corruption during root page split
The problem is in rtr_adjust_upper_level(), which allocates node_ptr
from heap, and then passes the same heap to btr_cur_pessimistic_insert().
The documentation of btr_cur_pessimistic_insert() says that the heap can
be emptied. If the heap is emptied and something else is allocated from
the heap, the node_ptr can become corrupted.
2019-05-07 10:44:00 +03:00
Oleksandr Byelkin
633946fb63 Merge branch '10.1' into 10.2 2019-05-06 18:07:40 +02:00
Marko Mäkelä
0573744a83 Revert "MDEV-19399 do not call slow my_timer_init() several times"
This reverts commit 8dc670a5e8.

The symbol sys_timer_info was not being exported correctly,
which caused linking failures on some platforms.
2019-05-06 17:15:32 +03:00
Eugene Kosov
c83f837053 MDEV-18214 remove some duplicated MONITOR counters
MONITOR_PENDING_LOG_WRITE
MONITOR_PENDING_CHECKPOINT_WRITE
MONITOR_LOG_IO: read values from log_t members instead of updating own
monitor variables
2019-05-06 16:00:15 +03:00
Eugene Kosov
8dc670a5e8 MDEV-19399 do not call slow my_timer_init() several times
No functional change.

Call my_timer_init() only once and then reuse it from InnoDB and
perfschema storage engines.

This patch speeds up empty test for me like this:
./mtr -mem innodb.kevg,xtradb  1.21s user 0.84s system 34% cpu 5.999 total
./mtr -mem innodb.kevg,xtradb  1.12s user 0.60s system 31% cpu 5.385 total
2019-05-06 15:38:02 +03:00
Oleksandr Byelkin
8cbb14ef5d Merge branch '10.1' into 10.2 2019-05-04 17:04:55 +02:00
Marko Mäkelä
ce195987c3 MDEV-19385: Inconsistent definition of dtuple_get_nth_v_field()
The accessor dtuple_get_nth_v_field() was defined differently between
debug and release builds in MySQL 5.7.8 in
mysql/mysql-server@c47e1751b7
and a debug assertion to document or enforce the questionable assumption
tuple->v_fields == &tuple->fields[tuple->n_fields] was missing.

This was apparently no problem until MDEV-11369 introduced instant
ADD COLUMN to MariaDB Server 10.3. With that work present, in one
test case, trx_undo_report_insert_virtual() could in release builds
fetch the wrong value for a virtual column.

We replace many of the dtuple_t accessors with const-preserving
inline functions, and fix missing or misleadingly applied const
qualifiers accordingly.
2019-05-03 20:02:50 +03:00
Marko Mäkelä
3db94d2403 MDEV-19346: Remove dummy InnoDB log checkpoints
log_checkpoint(), log_make_checkpoint_at(): Remove the parameter
write_always. It seems that the primary purpose of this parameter
was to ensure in the function recv_reset_logs() that both checkpoint
header pages will be overwritten, when the function is called from
the never-enabled function recv_recovery_from_archive_start().

create_log_files(): Merge recv_reset_logs() to its only caller.

Debug instrumentation: Prefer to flush the redo log, instead of
triggering a redo log checkpoint.

page_header_set_field(): Disable a debug assertion that will
always fail due to MDEV-19344, now that we no longer initiate
a redo log checkpoint before an injected crash.

In recv_reset_logs() there used to be two calls to
log_make_checkpoint_at(). The apparent purpose of this was
to ensure that both InnoDB redo log checkpoint header pages
will be initialized or overwritten.
The second call was removed (without any explanation) in MySQL 5.6.3:
mysql/mysql-server@4ca37968da

In MySQL 5.6.8 WL#6494, starting with
mysql/mysql-server@00a0ba8ad9
the function recv_reset_logs() was not only invoked during
InnoDB data file initialization, but also during a regular
startup when the redo log is being resized.

mysql/mysql-server@45e9167983
in MySQL 5.7.2 removed the UNIV_LOG_ARCHIVE code, but still
did not remove the parameter write_always.
2019-05-03 20:02:11 +03:00
Thirunarayanan Balathandayuthapani
ada1074bb1 MDEV-14398 innodb_encryption_rotate_key_age=0 causes innodb_encrypt_tables to be ignored
The statement

SET GLOBAL innodb_encryption_rotate_key_age=0;

would have the unwanted side effect that ENCRYPTION=DEFAULT tablespaces
would no longer be encrypted or decrypted according to the setting of
innodb_encrypt_tables.

We implement a trigger, so that whenever one of the following is executed:

SET GLOBAL innodb_encrypt_tables=OFF;
SET GLOBAL innodb_encrypt_tables=ON;
SET GLOBAL innodb_encrypt_tables=FORCE;

all wrong-state ENCRYPTION=DEFAULT tablespaces will be added to
fil_system_t::rotation_list, so that the encryption will be added
or removed.

Note: This will *NOT* happen automatically after a server restart.
Before reading the first page of a data file, InnoDB cannot know
the encryption status of the data file. The statement
SET GLOBAL innodb_encrypt_tables will have the side effect that
all not-yet-read InnoDB data files will be accessed in order to
determine the encryption status.

innodb_encrypt_tables_validate(): Stop disallowing
SET GLOBAL innodb_encrypt_tables when innodb_encryption_rotate_key_age=0.
This reverts part of commit 50eb40a2a8
that addressed MDEV-11738 and MDEV-11581.

fil_system_t::read_page0(): Trigger a call to fil_node_t::read_page0().
Refactored from fil_space_get_space().

fil_crypt_rotation_list_fill(): If innodb_encryption_rotate_key_age=0,
initialize fil_system->rotation_list. This is invoked both on
SET GLOBAL innodb_encrypt_tables and
on SET GLOBAL innodb_encryption_rotate_key_age=0.

fil_space_set_crypt_data(): Remove.

fil_parse_write_crypt_data(): Simplify the logic.

This is joint work with Marko Mäkelä.
2019-05-02 13:31:59 +03:00
Marko Mäkelä
810f014ca7 MDEV-18429: Simpler implementation
This reverts commit 61f370a3c9
and implements a simpler fix that is straightforward to merge to 10.3.

lock_print_info: Renamed from PrintNotStarted. Dump the
entire contents of trx_sys->mysql_trx_list.

lock_print_info_rw_recovered: Like lock_print_info, but dump
only recovered transactions in trx_sys->rw_trx_list.

lock_print_info_all_transactions(): Dump both trx_sys->mysql_trx_list
and trx_sys->rw_trx_list.

TrxLockIterator, TrxListIterator, lock_rec_fetch_page(): Remove.
This is a partial backport of the 10.3
commit a447980ff3
which removed the race-condition-prone ability of the InnoDB monitor
to read relevant pages into the buffer pool for some record locks.
2019-04-29 17:49:55 +03:00
Marko Mäkelä
092602ac9b MDEV-14130 InnoDB messages should not refer to the MySQL 5.7 manual
In InnoDB error messages, replace the hyperlink URLs to point
to the MariaDB knowledge base.
2019-04-29 15:11:06 +03:00
Marko Mäkelä
5fb4c0a8a8 Clean up ut_list
ut_list_validate(), ut_list_map(): Add variants with const Functor&
so that these functions can be called with an rvalue.

Remove wrapper macros, and add #ifdef UNIV_DEBUG around debug-only code.
2019-04-29 15:11:06 +03:00
Eugene Kosov
61f370a3c9 MDEV-18429 Consistent non-locking reads do not appear in TRANSACTIONS section of SHOW ENGINE INNODB STATUS
lock_print_info_all_transactions(): print transactions which are started
but not in RW or RO lists.

print_not_started(): Replaces PrintNotStarted. Collect the skipped
transactions.
2019-04-29 15:11:06 +03:00
Marko Mäkelä
1b577e4d8b MDEV-19356 Assertion 'space->free_limit == 0 || space->free_limit == free_limit'
fil_node_t::read_page0(): Do not replace up-to-date metadata with a
possibly old version of page 0 that is being reread from the file.
A more up-to-date page 0 could still exists in the buffer pool,
waiting to be written back to the file.
2019-04-29 11:43:22 +03:00
Marko Mäkelä
e10b3fa97a MDEV-19231: Correct an assertion
BtrBulk::finish(): On error, the B-tree can be corrupted. Only upon
successful completion, it makes sense to validate the index.
2019-04-29 10:04:54 +03:00
Marko Mäkelä
1a5ba2a4be MDEV-19342 Merge new release of InnoDB 5.7.26 to 10.2 2019-04-26 18:19:50 +03:00
Marko Mäkelä
4e9f8c9cc4 Remove roll_node_t::partial
The field roll_node_t::partial holds if and only if
savept has been set. Make savept a pointer.

trx_rollback_start(): Use the semantic type undo_no_t for roll_limit.
2019-04-26 17:40:20 +03:00
Aditya A
f3a9f12bc3 Bug #29021730 CRASHING INNOBASE_COL_CHECK_FK WITH FOREIGN KEYS
PROBLEM
-------
Function innodb_base_col_setup_for_stored() was skipping to store
the base column information for a generated column if the base column
was a "STORED" generated column. This later causes a crash in function
innoabse_col_check_fk() where it says that a generated columns depends
upon two base columns ,but there is information on only one of them.
There was a explicit check barring the stored columns being stored,
which is wrong because the documentation says that a generated stored
column can be a part of a generated column.

FIX
----
Store the information of base column if it is a stored generated column.

#RB21247
Reviewed by: Debarun Banerjee <debarun.banerjee@oracle.com>
2019-04-26 17:40:20 +03:00
Marko Mäkelä
793bd3ee13 lock_rec_convert_impl_to_expl_for_trx(): Remove unused parameter 2019-04-26 17:40:20 +03:00
Sachin Agarwal
06ec56f579 Bug #27850600 INNODB ASYNC IO ERROR HANDLING IN IO_EVENT
Problem:
io_getevents() - read asynchronous I/O events from the completion
queue. For each IO event, the res field in io_event tells whether IO
event is succeeded or not. To see if the IO actually succeeded we
always need to check event.res (negative=error,
positive=bytesread/written).
LinuxAIOHandler::collect() doesn't check event.res value for each event.
which leads to incorrect value in n_bytes for IO context (or IO Slot).

Fix:
Added a check for event.res negative value.

RB: 20871
Reviewed by : annamalai.gurusami@oracle.com
2019-04-26 17:40:20 +03:00
Marko Mäkelä
1c4d1f3d05 innobase_col_check_fk(): Remove copying 2019-04-26 17:40:19 +03:00
Marko Mäkelä
b2dbc781c7 Implement --debug=d,ib_log_checkpoint_avoid
Normally, the InnoDB master thread executes InnoDB log checkpoints
so frequently that bugs in crash recovery or redo logging can be
hard to reproduce. This is because crash recovery would start replaying
the log only from the latest checkpoint. Because the InnoDB redo log
format only allows saving information for at most 2 latest checkpoints,
and because the log files are written in a circular fashion, it would
be challenging to implement a debug option that would start the redo
log apply from the very start of the redo log file.
2019-04-25 17:26:23 +03:00
Eugene Kosov
6c5c1f0b2f MDEV-19231 make DB_SUCCESS equal to 0
It's a micro optimization. On most platforms CPUs has instructions to
compare with 0 fast. DB_SUCCESS is the most popular outcome of functions
and this patch optimized code like (err == DB_SUCCESS)

BtrBulk::finish(): bogus assertion fixed

fil_node_t::read_page0(): corrected usage of os_file_read()

que_eval_sql(): bugus assertion removed. Apparently it checked that
the field was assigned after having been zero-initialized at
object creation.

It turns out that the return type of os_file_read_func() was changed
in mysql/mysql-server@98909cefbc (MySQL 5.7)
from ibool to dberr_t. The reviewer (if there was any) failed to
point out that because of future merges, it could be a bad idea to
change the return type of a function without changing the function name.

This change was applied to MariaDB 10.2.2 in
commit 2e814d4702 but the
MariaDB-specific code was not fully adjusted accordingly,
e.g. in fil_node_open_file(). Essentially, code like
!os_file_read(...) became dead code in MariaDB and later
in Mariabackup 10.2, and we could be dealing with an uninitialized
buffer after a failed page read.
2019-04-25 16:29:55 +03:00
Marko Mäkelä
caa9023c9e MDEV-19331 Merge new release of InnoDB 5.6.44 to 10.1 2019-04-25 14:15:54 +03:00
Marko Mäkelä
1cd31bc132 Bug#28573894 ALTER PARTITIONED TABLE ADD AUTO_INCREMENT DIFF RESULT DEPENDING ON ALGORITHM
For partitioned table, ensure that the AUTO_INCREMENT values will
be assigned from the same sequence. This is based on the following
change in MySQL 5.6.44:

commit aaba359c13d9200747a609730dafafc3b63cd4d6
Author: Rahul Malik <rahul.m.malik@oracle.com>
Date:   Mon Feb 4 13:31:41 2019 +0530

    Bug#28573894 ALTER PARTITIONED TABLE ADD AUTO_INCREMENT DIFF RESULT DEPENDING ON ALGORITHM

    Problem:
    When a partition table is in-place altered to add an auto-increment column,
    then its values are starting over for each partition.

    Analysis:
    In the case of in-place alter, InnoDB is creating a new sequence object
    for each partition. It is default initialized. So auto-increment columns
    start over for each partition.

    Fix:
    Assign old sequence of the partition to the sequence of next partition
    so it won't start over.

    RB#21148
    Reviewed by Bin Su <bin.x.su@oracle.com>
2019-04-25 14:12:45 +03:00
Marko Mäkelä
9e7bcb05d4 Clean up ib_sequence::m_max_value
Correctly document the usage of m_max_value. Remove the const
qualifier, so that the implicit assignment operator can be used.
Make all members of ib_sequence private, and add an accessor
member function max_value().
2019-04-25 14:12:45 +03:00
Aakanksha Verma
3ae2198483 Bug #19811005 ALTER TABLE ADD INDEX DOES NOT UPDATE INDEX_LENGTH IN I_S TABLES
PROBLEM
 =======
An add index doesn't update index length stats in information schema
TABLES table.

FIX
 ===
Update the dict_table_t variable with index length stats that is
actually calculated post alter . As this variable is used to populated
the information schema index length statistics.

Reviewed by: Bin su<bin.x.su@oracle.com>
RB: 21277
2019-04-25 14:12:45 +03:00
Marko Mäkelä
bc145193c1 Merge 10.1 into 10.2 2019-04-25 09:04:09 +03:00
Marko Mäkelä
bfb0726fc2 Merge 5.5 into 10.1 2019-04-24 12:03:11 +03:00
Thirunarayanan Balathandayuthapani
d5da8ae04d MDEV-15772 Potential list overrun during XA recovery
InnoDB could return the same list again and again if the buffer
passed to trx_recover_for_mysql() is smaller than the number of
transactions that InnoDB recovered in XA PREPARE state.

We introduce the transaction state TRX_PREPARED_RECOVERED, which
is like TRX_PREPARED, but will be set during trx_recover_for_mysql()
so that each transaction will only be returned once.

Because init_server_components() is invoking ha_recover() twice,
we must reset the state of the transactions back to TRX_PREPARED
after returning the complete list, so that repeated traversals
will see the complete list again, instead of seeing an empty list.
Without this tweak, the test main.tc_heuristic_recover would hang
in MariaDB 10.1.
2019-04-24 11:46:14 +03:00
Marko Mäkelä
e5aa8ea525 MDEV-18139 ALTER IGNORE ... ADD FOREIGN KEY causes bogus error
dict_create_foreign_constraints_low(): Tolerate the keywords
IGNORE and ONLINE between the keywords ALTER and TABLE.

We should really remove the hacky FOREIGN KEY constraint parser
from InnoDB.
2019-04-23 17:56:43 +03:00
Marko Mäkelä
d315b4ff39 Remove IBUF_COUNT_DEBUG
The compile-time option IBUF_COUNT_DEBUG has not been used for years.
It would only work with up to 3 created .ibd files, with no buffered
changes existing while InnoDB is started up.
2019-04-19 12:44:46 +03:00
Marko Mäkelä
169c00994b MDEV-12699 Improve crash recovery of corrupted data pages
InnoDB crash recovery used to read every data page for which
redo log exists. This is unnecessary for those pages that are
initialized by the redo log. If a newly created page is corrupted,
recovery could unnecessarily fail. It would suffice to reinitialize
the page based on the redo log records.

To add insult to injury, InnoDB crash recovery could hang if it
encountered a corrupted page. We will fix also that problem.
InnoDB would normally refuse to start up if it encounters a
corrupted page on recovery, but that can be overridden by
setting innodb_force_recovery=1.

Data pages are completely initialized by the records
MLOG_INIT_FILE_PAGE2 and MLOG_ZIP_PAGE_COMPRESS.
MariaDB 10.4 additionally recognizes MLOG_INIT_FREE_PAGE,
which notifies that a page has been freed and its contents
can be discarded (filled with zeroes).

The record MLOG_INDEX_LOAD notifies that redo logging has
been re-enabled after being disabled. We can avoid loading
the page if all buffered redo log records predate the
MLOG_INDEX_LOAD record.

For the internal tables of FULLTEXT INDEX, no MLOG_INDEX_LOAD
records were written before commit aa3f7a107c.
Hence, we will skip these optimizations for tables whose
name starts with FTS_.

This is joint work with Thirunarayanan Balathandayuthapani.

fil_space_t::enable_lsn, file_name_t::enable_lsn: The LSN of the
latest recovered MLOG_INDEX_LOAD record for a tablespace.

mlog_init: Page initialization operations discovered during
redo log scanning. FIXME: This really belongs in recv_sys->addr_hash,
and should be removed in MDEV-19176.

recv_addr_state: Add the new state RECV_WILL_NOT_READ to
indicate that according to mlog_init, the page will be
initialized based on redo log record contents.

recv_add_to_hash_table(): Set the RECV_WILL_NOT_READ state
if appropriate. For now, we do not treat MLOG_ZIP_PAGE_COMPRESS
as page initialization. This works around bugs in the crash
recovery of ROW_FORMAT=COMPRESSED tables.

recv_mark_log_index_load(): Process a MLOG_INDEX_LOAD record
by resetting the state to RECV_NOT_PROCESSED and by updating
the fil_name_t::enable_lsn.

recv_init_crash_recovery_spaces(): Copy fil_name_t::enable_lsn
to fil_space_t::enable_lsn.

recv_recover_page(): Add the parameter init_lsn, to ignore
any log records that precede the page initialization.
Add DBUG output about skipped operations.

buf_page_create(): Initialize FIL_PAGE_LSN, so that
recv_recover_page() will not wrongly skip applying
the page-initialization record due to the field containing
some newer LSN as a leftover from a different page.
Do not invoke ibuf_merge_or_delete_for_page() during
crash recovery.

recv_apply_hashed_log_recs(): Remove some unnecessary lookups.
Note if a corrupted page was found during recovery.
After invoking buf_page_create(), do invoke
ibuf_merge_or_delete_for_page() via mlog_init.ibuf_merge()
in the last recovery batch.

ibuf_merge_or_delete_for_page(): Relax a debug assertion.

innobase_start_or_create_for_mysql(): Abort startup if
a corrupted page was found during recovery. Corrupted pages
will not be flagged if innodb_force_recovery is set.
However, the recv_sys->found_corrupt_fs flag can be set
regardless of innodb_force_recovery if file names are found
to be incorrect (for example, multiple files with the same
tablespace ID).
2019-04-17 13:58:41 +03:00
Marko Mäkelä
376bf4ede5 MDEV-19241 InnoDB fails to write MLOG_INDEX_LOAD upon completing ALTER TABLE
Similar to what was done in commit aa3f7a107c
for FULLTEXT INDEX, we must ensure that MLOG_INDEX_LOAD records will always
be written if redo logging was disabled.

row_merge_build_indexes(): Invoke row_merge_write_redo() also when
online operation is not being executed or an error occurs.
In case of an error, invoke flush_observer->interrupted() so that
the pages will not be flushed but merely evicted from the buffer pool.
Before resuming redo logging, it is crucial for the correctness of
mariabackup and InnoDB crash recovery to flush or evict all affected pages
and to write MLOG_INDEX_LOAD records.
2019-04-17 13:58:22 +03:00
Marko Mäkelä
bc8d173b9f MDEV-14239 Missing space: "innodb_open_files ... greaterthan"
innobase_init(): Add a missing space to a warning message.
Apparently, this message was corrupted in MariaDB 10.2.2 in
commit fec844aca8 related to a
conflict resolution when applying a change from MySQL 5.7.12.
2019-04-15 19:17:24 +03:00
Marko Mäkelä
4ac8fa008d FSP_FLAGS_MEM_MASK: Remove traces of ATOMIC_WRITES 2019-04-10 16:21:57 +03:00
Marko Mäkelä
e7f426d2c9 MDEV-19212: Replace macros with type-safe inline functions
The regression that was reported in MDEV-19212 occurred due to use
of macros that did not ensure that the arguments have compatible
types.

ut_2pow_remainder(), ut_2pow_round(), ut_calc_align(): Define as
inline function templates.

UT_CALC_ALIGN(): Define as a macro, because this is used in
compile_time_assert(). Only starting with C++11 (MariaDB 10.4)
we could define the inline functions as constexpr.
2019-04-08 21:33:49 +03:00
Marko Mäkelä
f120a15b93 MDEV-19212 4GB Limit on large_pages - integer overflow
os_mem_alloc_large(): Invoke the macro ut_2pow_round() with the
correct argument type.

innobase_large_page_size, innobase_use_large_pages,
os_use_large_pages, os_large_page_size: Remove.
Simply refer to opt_large_page_size, my_use_large_pages.
2019-04-08 21:33:49 +03:00
Marko Mäkelä
4b822111ef MDEV-8139: Clean up the freeing of B-tree pages
btr_page_free(): Renamed from btr_page_free_low().
If scrubbing is enabled, zero out the page with proper redo logging.
Only pass ahi=true to fseg_free_page() if the page is actually indexed.

fil_space_t::modify_check(): Renamed from fsp_space_modify_check().

fsp_init_file_page(): Define inline.
2019-04-08 21:33:49 +03:00
Marko Mäkelä
7f5849a809 MDEV-18309: Remove unused code 2019-04-07 12:05:12 +03:00
Marko Mäkelä
867617a976 MDEV-18309: InnoDB reports bogus errors about missing #sql-*.ibd on startup
This is a follow-up to MDEV-18733. As part of that fix, we made
dict_check_sys_tables() skip tables that would be dropped by
row_mysql_drop_garbage_tables().

DICT_ERR_IGNORE_DROP: A new mode where the file should not be attempted
to be opened.

dict_load_tablespace(): Do not try to load the tablespace if
DICT_ERR_IGNORE_DROP has been specified.

row_mysql_drop_garbage_tables(): Pass the DICT_ERR_IGNORE_DROP mode.

fil_space_for_table_exists_in_mem(): Remove a parameter.
The only caller that passed print_error_if_does_not_exist=true
was row_drop_single_table_tablespace().
2019-04-07 10:57:38 +03:00
Marko Mäkelä
1d30b7b1d2 MDEV-12699 preparation: Clean up recv_sys
The recv_sys data structures are accessed not only from the thread
that executes InnoDB plugin initialization, but also from the
InnoDB I/O threads, which can invoke recv_recover_page().

Assert that sufficient concurrency control is in place.
Some code was accessing recv_sys data structures without
holding recv_sys->mutex.

recv_recover_page(bpage): Refactor the call from buf_page_io_complete()
into a separate function that performs necessary steps. The
main thread was unnecessarily releasing and reacquiring recv_sys->mutex.

recv_recover_page(block,mtr,recv_addr): Pass more parameters from
the caller. Avoid redundant lookups and computations. Eliminate some
redundant variables.

recv_get_fil_addr_struct(): Assert that recv_sys->mutex is being held.
That was not always the case!

recv_scan_log_recs(): Acquire recv_sys->mutex for the whole duration
of the function. (While we are scanning and buffering redo log records,
no pages can be read in.)

recv_read_in_area(): Properly protect access with recv_sys->mutex.

recv_apply_hashed_log_recs(): Check recv_addr->state only once,
and continuously hold recv_sys->mutex. The mutex will be released
and reacquired inside recv_recover_page() and recv_read_in_area(),
allowing concurrent processing by buf_page_io_complete() in I/O threads.
2019-04-06 21:25:43 +03:00
Marko Mäkelä
aa3f7a107c MDEV-12699 preparation: Write MLOG_INDEX_LOAD for FTS_ tables
The record MLOG_INDEX_LOAD is supposed to be written to indicate that
some page modifications bypassed redo logging, and that redo logging
is now re-enabled. It was not written for fulltext indexes during
ALTER TABLE.

row_merge_write_redo(): Declare globally. Assert that the index
is neither a spatial nor fulltext index.

recv_mlog_index_load(): Observe a MLOG_INDEX_LOAD operation.

recv_parse_log_recs(): Handle MLOG_INDEX_LOAD also in multi-record
mini-transactions. Because of this omission, we should keep writing
MLOG_INDEX_LOAD in single-record mini-transactions, because older
versions of Mariabackup would fail.

row_fts_merge_insert(): Write MLOG_INDEX_LOAD for the auxiliary
tables of fulltext indexes.
2019-04-06 21:25:43 +03:00
Marko Mäkelä
45d338dca8 MDEV-12699 preparation: Initialize the entire page on MLOG_ZIP_PAGE_COMPRESS
The record MLOG_ZIP_PAGE_COMPRESS is similar to MLOG_INIT_FILE_PAGE2
that it contains all the information needed to initialize the page.
Like for the other record, do initialize the entire page on recovery.
2019-04-06 21:25:43 +03:00
Marko Mäkelä
1b95118c5f buf_page_get_gen(): Allow BUF_GET_IF_IN_POOL with a dummy page_size
The page_size argument to buf_page_get_gen() only matters when the
page is going to be loaded into the buffer pool. Allow callers to
pass a dummy parameter when using BUF_GET_IF_IN_POOL (which would
return NULL if the block is not in the buffer pool).
2019-04-06 21:25:43 +03:00
Marko Mäkelä
80f29211eb Fix a crash in CHECK TABLE for corrupted encrypted root page
btr_root_get(): Ignore the root->page.encrypted flag.
The purpose of this flag is questionable since
commit 8c43f96388.

btr_validate_index(): Avoid crash if btr_root_get() returns NULL.
2019-04-06 21:25:43 +03:00
Marko Mäkelä
1d0380e029 MDEV-15528 preparation: Do not modify a freed page
btr_free_root(): Add the parameter bool invalidate.

btr_free_root_invalidate(): Remove.
2019-04-06 21:25:43 +03:00
Marko Mäkelä
56df18be65 Clean up the parsing of MLOG_INIT_FILE_PAGE2
fsp_apply_init_file_page(): Renamed from fsp_init_file_page_low().

fsp_parse_init_file_page(): Remove. The redo log record has no
parameters.
2019-04-06 21:25:43 +03:00
Marko Mäkelä
71f9552fd8 recv_recovery_is_on(): Add UNIV_UNLIKELY
Normally, InnoDB is not in the process of executing crash recovery.
Provide a hint to the compiler that the recovery-related code paths
are rarely executed.
2019-04-06 21:25:43 +03:00
Marko Mäkelä
f602385776 Do not pass table_name_t to printf-like functions 2019-04-04 08:57:53 +03:00
Marko Mäkelä
c676de1692 Fix the non-debug build 2019-04-03 21:00:13 +03:00
Marko Mäkelä
28636a92ed Merge 10.1 into 10.2 2019-04-03 19:58:47 +03:00
Marko Mäkelä
cad56fbaba MDEV-18733 MariaDB slow start after crash recovery
If InnoDB crash recovery was needed, the InnoDB function srv_start()
would invoke extra validation, reading something from every InnoDB
data file. This should be unnecessary now that MDEV-14717 made
RENAME operations crash-safe inside InnoDB (which can be
disabled in MariaDB 10.2 by setting innodb_safe_truncate=OFF).

dict_check_sys_tables(): Skip tables that would be dropped by
row_mysql_drop_garbage_tables(). Perform extra validation only
if innodb_safe_truncate=OFF, innodb_force_recovery=0 and
crash recovery was needed.

dict_load_table_one(): Validate the root page of the table.
In this way, we can deny access to corrupted or mismatching tables
not only after crash recovery, but also after a clean shutdown.
2019-04-03 19:56:03 +03:00
Marko Mäkelä
7984ea80de Remove a useless CHECK TABLE printout for debug builds 2019-04-03 19:56:03 +03:00
Marko Mäkelä
fa14784301 Fix more -Wnonnull-compare
For some reason, GCC 8 did not issue warnings for all such comparisons.
2019-04-03 19:21:54 +03:00
Marko Mäkelä
a1ec7ac4f4 Clean up table_name_t
row_is_mysql_tmp_table_name(): Replaced with
dict_table_t::is_temporary_name() and table_name_t::is_temporary().

table_name_t: Add constructors.
2019-04-03 16:09:16 +03:00
Marko Mäkelä
03672a0573 MDEV-11487: Remove dict_table_get_n_sys_cols()
In MariaDB, InnoDB tables will always contain DATA_N_SYS_COLS = 3
columns, 2 or 3 of which are present in the clustered index.
We remove the predicate that was added in MySQL 5.7 as part of WL#7682.
2019-04-03 11:02:55 +03:00
Marko Mäkelä
dbc716675b Merge 10.1 into 10.2 2019-04-03 10:32:21 +03:00
Marko Mäkelä
c0fca2863b Fix -Wnonnull-compare
InnoDB and XtraDB had redundant assertions for checking that
function parameters that were declared as nonnull were not NULL.
2019-04-03 09:46:49 +03:00
Marko Mäkelä
e3f44d8d0e MDEV-19085: Remove a bogus debug assertion
MariaDB does support InnoDB tables with no stored columns.
(They are necessarily empty.)
2019-04-02 16:40:27 +03:00
Marko Mäkelä
5633f83ca4 Fix integer type mismatch 2019-04-02 13:46:36 +03:00
Marko Mäkelä
8650848ec3 MDEV-19128 fil_name_parse() for MLOG_FILE_ is not portable
On Microsoft Windows, InnoDB writes the path separator \ to the
redo log file, while on all other platforms, / is being used.

fil_name_parse(): Normalize the parsed path separators to the
native format. This allows backups or data sets to be portable
between Windows and other systems.
2019-04-02 13:43:22 +03:00
Marko Mäkelä
bce380f2a5 Merge 10.1 into 10.2 2019-04-02 09:14:15 +03:00
Marko Mäkelä
b88f378648 Omit the definition of unused function yyset_extra()
This is follow-up for commit 619d22dde5
to fix the cmake -DWITH_EMBEDDED_SERVER build.
2019-04-02 08:50:53 +03:00
Marko Mäkelä
d59ad6972b MDEV-19085: Fix a typo that was caught by GCC 5.4 2019-04-01 18:13:11 +03:00
Marko Mäkelä
f055da9b84 MDEV-19085 Assertion failures due to virtual columns after upgrading from 10.1
MariaDB before MDEV-5800 in version 10.2.2 did not support
indexed virtual columns. Non-persistent virtual columns were
hidden from storage engines. Only starting with MDEV-5800, InnoDB
would create internal metadata on virtual columns.

Similar to what was done in MDEV-18084, MDEV-18090, MDEV-18960, we adjust
one more code path for the old tables.

innobase_build_col_map(): Allocate space for virtual columns in col_map[]
but leave the entries at ULINT_UNDEFINED, noting that the virtual columns
were missing before the table was being rebuilt.
2019-04-01 14:24:15 +03:00
Marko Mäkelä
619d22dde5 Rebuild the InnoDB lexical analyzers with flex 2.6.4
InnoDB includes 3 parsers, which use 3 lexical analyzers that
are generated with flex. Flex versions before 2.6 emitted
the keyword "register", which is deprecated in C++17.

The lexical analyzers were regenerated as follows:

for s in storage/innobase storage/xtradb
do
	(cd "$s"/pars; ./make_flex.sh)
	touch "$s"/fts/*.l
	make -C "$s"/fts -f Makefile.query
done
2019-04-01 13:03:18 +03:00
Marko Mäkelä
23eeecd68f MDEV-19111 Unused field INFORMATION_SCHEMA.INNODB_TABLESPACES_SCRUBBING.ROTATING_OR_FLUSHING
The MDEV-11738/MDEV-11581 fix was supposed to add the column
ROTATING_OR_FLUSHING to the INFORMATION_SCHEMA table
INNODB_TABLESPACES_ENCRYPTION, but it also added that column to
INNODB_TABLESPACES_SCRUBBING in InnoDB (not XtraDB).

The extra column was never initialized. We will remove it,
because key rotation has nothing to do with the scrubbing of
tablespace data.
2019-04-01 10:32:03 +03:00
Eugene Kosov
76934212eb remove unneeded code
rec_get_offsets() was previously called in a btr_cur_optimistic_update()
2019-03-29 23:42:26 +04:00
Eugene Kosov
b66164ab56 remove dead code 2019-03-29 23:40:28 +04:00
Sergei Golubchik
cc71e7501c post-merge: -Werror fixes in 10.2 2019-03-29 10:58:25 +01:00
Marko Mäkelä
d0116e10a5 Revert MDEV-18464 and MDEV-12009
This reverts commit 21b2fada7a
and commit 81d71ee6b2.

The MDEV-18464 change introduces a few data race issues. Contrary to
the documentation, the field trx_t::victim is not always being protected
by lock_sys_t::mutex and trx_t::mutex. Most importantly, it seems
that KILL QUERY could wrongly avoid acquiring both mutexes when
invoking lock_trx_handle_wait_low(), in case another thread had
already set trx->victim=true.

We also revert MDEV-12009, because it should depend on the MDEV-18464
fix being present.
2019-03-28 12:39:50 +02:00
Thirunarayanan Balathandayuthapani
0623cc7c16 MDEV-19051 Avoid unnecessary writing MLOG_INDEX_LOAD
1) Avoid writing of MLOG_INDEX_LOAD redo log record during inplace
alter table when the table is empty and also for spatial index.

2) Avoid creation of temporary merge file for spatial index during
index creation process.
2019-03-28 10:55:18 +02:00
Jan Lindström
21b2fada7a MDEV-18464: Port kill_one_trx fixes from 10.4 to 10.1
Pushed the decision for innodb transaction and system
locking down to lock0lock.cc level. With this,
we can avoid releasing these mutexes for executions
where these mutexes were acquired upfront.

This patch will also fix BF aborting of native threads, e.g.
threads which have declared wsrep_on=OFF. Earlier, we have
used, for innodb trx locks, was_chosen_as_deadlock_victim
flag, for marking inodb transactions, which are victims for
wsrep BF abort. With native threads (wsrep_on==OFF), re-using
was_chosen_as_deadlock_victim flag may lead to inteference
with real deadlock, and to deal with this, the patch has added new
flag for marking wsrep BF aborts only: victim=true

Similar way if replication decides to abort one of the threads
we mark victim by: victim=true

innobase_kill_query
	Remove lock sys and trx mutex handling.

wsrep_innobase_kill_one_trx
	Mark victim trx with victim=true

trx0trx.h
	Remove trx_abort_t type and abort type variable from
	trx struct. Add victim variable to trx.

wsrep_kill_victim
	Remove abort_type

lock_report_waiters_to_mysql
	Take also trx mutex and mark trx as a victim for
	replication abort.

lock_trx_handle_wait_low
	New low level function to check whether the transaction
	has already been rolled back because it was selected as
	a deadlock victim, or if it has to wait then cancel
	the wait lock.

lock_trx_handle_wait
	If transaction is not marked as victim take lock sys
	and trx mutex before calling lock_trx_handle_wait_low
	and release them after that.

row_search_for_mysql
	Remove lock sys and trx mutex taking and releasing.

trx_rollback_to_savepoint_for_mysql_low
trx_commit_in_memory
	Clean up victim variable.
2019-03-28 07:40:03 +02:00
Sergei Golubchik
f97d879bf8 cmake: re-enable -Werror in the maintainer mode
now we can afford it. Fix -Werror errors. Note:
* old gcc is bad at detecting uninit variables, disable it.
* time_t is int or long, cast it for printf's
2019-03-27 22:51:37 +01:00
Marko Mäkelä
1e9c2b2305 Merge 10.1 into 10.2 2019-03-27 12:26:11 +02:00
Marko Mäkelä
a6585d5ce9 Merge 10.0 into 10.1 2019-03-27 11:56:08 +02:00
Marko Mäkelä
828cc2ba7c MDEV-18417/MDEV-18656/MDEV-18417: Work around compiler ASAN bug
In a Ubuntu Xenial build environment, the compiler identified as
g++-5.real (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609
seems to be emitting incorrect code for the compilation unit
trx0rec.cc, triggering a bogus-looking AddressSanitizer report
of an invalid read of something in the function trx_undo_rec_get_pars().
This is potentially affecting any larger tests where the InnoDB
purge subsystem is being exercised.

When the optimization level of trx0rec.cc is limited to -O1, no
bogus failure is being reported. With -O2 or -O3, a lot of things
seemed to be inlined in the function, and the disassembly of the
generated code did not make sense to me.
2019-03-27 11:54:34 +02:00
Marko Mäkelä
226ca250ed Merge 10.1 into 10.2 2019-03-26 14:17:19 +02:00
Marko Mäkelä
065ba53ccb MDEV-12711 mariabackup --backup is refused for multi-file system tablespace
Before MDEV-12113 (MariaDB Server 10.1.25), on shutdown InnoDB would write
the current LSN to the first page of each file of the system tablespace.
This is incompatible with MariaDB's InnoDB table encryption, because
encryption repurposed the field for an encryption key ID and checksum.

buf_page_is_corrupted(): For the InnoDB system tablespace, skip
FIL_PAGE_FILE_FLUSH_LSN when checking if a page is all zero,
because the first page of each file in the system tablespace can
contain nonzero bytes in the field.
2019-03-26 13:51:15 +02:00
Marko Mäkelä
07096ada9b Fix the Windows build
btr_cur_compress_recommendation(): Backport a change from 10.3.

This is a follow-up to commit 1bd9815479.
2019-03-25 17:15:34 +02:00
Marko Mäkelä
525e79b057 MDEV-19022: InnoDB fails to cleanup useless B-tree pages
The test case for reproducing MDEV-14126 demonstrates that InnoDB can
end up with an index tree where a non-leaf page has only one child page.

The test case innodb.innodb_bug14676111 demonstrates that such pages
are sometimes unavoidable, because InnoDB does not implement any sort
of B-tree rotation.

But, there is no reason to allow a root page with only one child page.

btr_cur_node_ptr_delete(): Replaces btr_node_ptr_delete().

btr_page_get_father(): Declare globally.

btr_discard_only_page_on_level(): Declare with ATTRIBUTE_COLD.
It turns out that this function is not covered by the
innodb.innodb_bug14676111 test case after all.

btr_discard_page(): If the root page ends up having only one child
page, shrink the tree by invoking btr_lift_page_up().
2019-03-25 16:03:24 +02:00
Marko Mäkelä
ade0a0e9d5 Avoid sign mismatch in comparisons
This is follow-up to commit 1bd9815479.
2019-03-25 16:00:06 +02:00
Marko Mäkelä
1bd9815479 MDEV-14126: Fix type mismatch
Backport some changes to B-tree page accessor functions from 10.3,
including changing page_get_n_recs() to return uint16_t.
2019-03-25 11:27:29 +02:00
Marko Mäkelä
72b934e3f7 MDEV-14126: Detect unexpected emptying of B-tree pages
If an index page becomes empty, btr_page_empty() should be called.
2019-03-25 10:53:01 +02:00
Marko Mäkelä
a4d0d6828b MDEV-14126: Improve assertions in btr_pcur_store_position() 2019-03-25 10:53:00 +02:00
Marko Mäkelä
b59d484696 MDEV-14126: Remove page_is_root()
The predicate page_is_root(), which was added in MariaDB Server 10.2.2,
is based on a wrong assumption.

Under some circumstances, InnoDB can transform B-trees into a degenerate
state where a non-leaf page has no sibling pages. Because of this,
we cannot assume that a page that has no siblings is the root page.
This bug will be tracked as MDEV-19022.

Because of the bug that may affect many InnoDB data files, we must remove
and replace the wrong predicate. Using the wrong predicate can cause
corruption. A leaf page is not allowed to be empty except if it is the
root page, and the entire table is empty.
2019-03-25 10:53:00 +02:00
Marko Mäkelä
71c781bfe7 MDEV-18090 Assertion failures due to virtual columns after upgrading to 10.2
MariaDB before MDEV-5800 in version 10.2.2 did not support
indexed virtual columns. Non-persistent virtual columns were
hidden from storage engines. Only starting with MDEV-5800, InnoDB
would create internal metadata on virtual columns.

Similar to what was done in MDEV-18084 and MDEV-18960, we adjust two more
code paths for the old tables.

ha_innobase::build_template(): Do not invoke
dict_index_contains_col_or_prefix() for virtual columns if InnoDB
does not store the metadata.

innobase_build_col_map(): Relax an assertion about the number of columns.

ha_innobase::omits_virtual_cols(): Renamed from omits_virtual_cols().
2019-03-25 10:29:48 +02:00
Chris Calender
c61e8a6597 Fix for MDEV-17449, typo in error message (#1146) 2019-03-24 21:24:28 +04:00
Marko Mäkelä
031fa8f1d2 Merge 10.1 into 10.2 2019-03-22 11:15:21 +02:00
Marko Mäkelä
8c493a910f Merge 10.0 into 10.1 2019-03-21 21:06:01 +02:00
Marko Mäkelä
5d454181a8 MDEV-6262 follow-up: Ensure NUL termination on strncpy() 2019-03-21 10:29:59 +02:00
Marko Mäkelä
1caec9c898 Remove unnecessary trx_rsegf_get_new() calls
trx_rseg_header_create(): Return the block descriptor, and
remove the redundant trx_rsegf_get_new() call. Apparently
the idea of that call was some kind of encapsulation or
abstraction, to discourage the direct use of the constant TRX_RSEG.

This also removes the trx_purge_initiate_truncate() local
variable rseg_header, which was only used in debug builds.
2019-03-21 09:40:01 +02:00
Marko Mäkelä
5c1720f8af Remove unused variables from non-debug build 2019-03-21 09:30:37 +02:00
Marko Mäkelä
dcc4458aa0 Add appropriate #ifdef around a declaration 2019-03-21 09:30:03 +02:00
Marko Mäkelä
185a5000e6 Silence bogus -Wmaybe-uninitialized 2019-03-21 09:24:03 +02:00
Marko Mäkelä
630199e724 MDEV-18981 Possible corruption when using FOREIGN KEY with virtual columns
row_ins_foreign_fill_virtual(): Construct update->old_vrow
with ROW_COPY_DATA instead of ROW_COPY_POINTERS. With the latter,
the object would be pointing to a buffer pool page frame. That page
frame can become stale and invalid as soon as
row_ins_foreign_check_on_constraint() invokes mtr_t::commit().

Most of the time, the pointer target is not going to be overwritten
by anything, and everything appears to work correctly.
Buffer pool page replacement is highly unlikely, and any pessimistic
operation that would overwrite the old location of the record is only
slightly more likely. It is not known whether there is an actual bug.
This came up while diagnosing MDEV-18879 in MariaDB 10.3.
2019-03-20 18:34:49 +02:00
Marko Mäkelä
a77e266866 MDEV-18084: Crash on UPDATE after upgrade from 10.0 or 10.1
MariaDB Server 10.0 and 10.1 support non-indexed virtual columns,
which are hidden from the storage engine. Starting with MDEV-5800
in MariaDB 10.2.2, the virtual columns are visible to storage engines.

calc_row_difference(): Follow up the MDEV-17199 fix, which forgot
to increment num_v when skipping virtual columns in tables that
were created before MariaDB 10.2.2. This caused a corruption of
the update vector when an updated persistent column is preceded
by virtual columns.
2019-03-19 15:30:42 +02:00
Marko Mäkelä
1efda582ad Replace innobase_is_v_fld() with Field::stored_in_db()
The macro innobase_is_v_fld() turns out to be equivalent with
the opposite of Field::stored_in_db(). Remove the macro and
invoke the member function directly.

innodb_base_col_setup_for_stored(): Simplify a condition to only
check Field::vcol_info.

innobase_create_index_def(): Replace some redundant code with
DBUG_ASSERT().
2019-03-19 14:35:05 +02:00
Marko Mäkelä
9471dbafce MDEV-18960: Assertion !omits_virtual_cols(*form->s) on TRUNCATE
MariaDB before MDEV-5800 in version 10.2.2 did not support
indexed virtual columns. Non-persistent virtual columns were
hidden from storage engines. Only starting with MDEV-5800, InnoDB
would create internal metadata on virtual columns.

On TRUNCATE TABLE, an old .frm file from before MDEV-5800 may be
used as the table schema. When the table is being re-created by
InnoDB, the old schema must be used. That is, we may hide
the existence of virtual columns from InnoDB.

create_table_check_doc_id_col(): Remove the assertion that failed.
This function can actually correctly deal with virtual columns
that could have been created before MariaDB 10.2.2 introduced MDEV-5800.

create_table_info_t::create_table_def(): Do not create metadata for
virtual columns if the table definition was created before MariaDB 10.2.2.
2019-03-19 13:45:23 +02:00
Marko Mäkelä
00572a0b0c MDEV-17482 InnoDB fails to say which fatal error fsync() returned
os_file_fsync_posix(): If fsync() returns a fatal error,
do include errno in the error message.

In the future, we might handle fsync() or write or allocation failures
on InnoDB data files a little more gracefully: flag the affected index
or table as corrupted, and deny any subsequent writes to the table.

If a write to the undo log or redo log fails, an alternative to
killing the server could be to deny any writes to InnoDB tables
until the server has been restarted.
2019-03-18 12:32:10 +02:00
sysprg
26432e49d3 MDEV-17262: mysql crashed on galera while node rejoined cluster (#895)
This patch contains a fix for the MDEV-17262/17243 issues and
new mtr test.

These issues (MDEV-17262/17243) have two reasons:

1) After an intermediate commit, a transaction loses its status
of "transaction that registered in the MySQL for 2pc coordinator"
(in the InnoDB) due to the fact that since version 10.2 the
write_row() function (which located in the ha_innodb.cc) does
not call trx_register_for_2pc(m_prebuilt->trx) during the processing
of split transactions. It is necessary to restore this call inside
the write_row() when an intermediate commit was made (for a split
transaction).

Similarly, we need to set the flag of the started transaction
(m_prebuilt->sql_stat_start) after intermediate commit.

The table->file->extra(HA_EXTRA_FAKE_START_STMT) called from the
wsrep_load_data_split() function (which located in sql_load.cc)
will also do this, but it will be too late. As a result, the call
to the wsrep_append_keys() function from the InnoDB engine may be
lost or function may be called with invalid transaction identifier.

2) If a transaction with the LOAD DATA statement is divided into
logical mini-transactions (of the 10K rows) and binlog is rotated,
then in rare cases due to the wsrep handler re-registration at the
boundary of the split, the last portion of data may be lost. Since
splitting of the LOAD DATA into mini-transactions is technical,
I believe that we should not allow these mini-transactions to fall
into separate binlogs. Therefore, it is necessary to prohibit the
rotation of binlog in the middle of processing LOAD DATA statement.

https://jira.mariadb.org/browse/MDEV-17262 and
https://jira.mariadb.org/browse/MDEV-17243
2019-03-18 07:39:51 +02:00
Sergei Golubchik
f1134d5676 post-merge: gcc 8 warnings
note: Inherit String from Sql_alloc,
to get operators new and new[] in sync

in rocksdb gcc was complaining that non-lvalue was cast to const.
2019-03-15 21:00:50 +01:00
Sergei Golubchik
0508d327ae Merge branch '10.1' into 10.2 2019-03-15 21:00:41 +01:00
Marko Mäkelä
34db9958e2 MDEV-18936 Purge thread fails to exit on shutdown
When there is a huge transaction in the undo log, the purge threads
may get stuck in trx_purge_attach_undo_recs() for a long time,
causing the server to hang on a normal shutdown (innodb_fast_shutdown>0).

Apparently the innodb_purge_batch_size does not work correctly, or the
n_pages_handled is not being incremented correctly. We do not fix that
for now, but we will instead check if shutdown has been initiated,
allowing the purge threads to shut down without delays.
2019-03-15 12:09:46 +02:00
Sergei Golubchik
3d2d060b62 fix gcc 8 compiler warnings
There were two newly enabled warnings:
1. cast for a function pointers. Affected sql_analyse.h, mi_write.c
   and ma_write.cc, mf_iocache-t.cc, mysqlbinlog.cc, encryption.cc, etc

2. memcpy/memset of nontrivial structures. Fixed as:
* the warning disabled for InnoDB
* TABLE, TABLE_SHARE, and TABLE_LIST got a new method reset() which
  does the bzero(), which is safe for these classes, but any other
  bzero() will still cause a warning
* Table_scope_and_contents_source_st uses `TABLE_LIST *` (trivial)
  instead of `SQL_I_List<TABLE_LIST>` (not trivial) so it's safe to
  bzero now.
* added casts in debug_sync.cc and sql_select.cc (for JOIN)
* move assignment method for MDL_request instead of memcpy()
* PARTIAL_INDEX_INTERSECT_INFO::init() instead of bzero()
* remove constructor from READ_RECORD() to make it trivial
* replace some memcpy() with c++ copy assignments
2019-03-14 16:33:17 +01:00
Marko Mäkelä
d3afdb1e8f Datafile::validate_first_page(): Change some ERROR to Note
On startup, if the InnoDB doublewrite buffer can be used to
recover a corrupted page, raising an ERROR about a recoverable
error seems inappropriate. Issue Note instead, and adjust
tests accordingly.

Also, correctly validate the tablespace ID in the files.
2019-03-14 10:15:50 +02:00
Marko Mäkelä
e63f621652 Remove references to MySQL 5.7 native InnoDB partitioning
The native InnoDB partitioning was never enabled in MariaDB.
Remove some declarations and comments referring to it.
2019-03-13 13:32:04 +02:00
Marko Mäkelä
56304875f7 MDEV-18836 ASAN: heap-use-after-free after TRUNCATE
row_drop_tables_for_mysql_in_background(): Copy the table name
before closing the table handle, to avoid heap-use-after-free if
another thread succeeds in dropping the table before
row_drop_table_for_mysql_in_background() completes the table name lookup.

dict_mem_create_temporary_tablename(): With innodb_safe_truncate=ON
(the default), generate a simple, unique, collision-free table name
using only the id, no pseudorandom component. This is safe, because
on startup, we will drop any #sql tables that might exist in InnoDB.
This is a backport from 10.3. It should have been backported already
as part of backporting MDEV-14717,MDEV-14585 which were prerequisites
for the MDEV-13564 backup-friendly TRUNCATE TABLE.
This seems to reduce the chance of table creation failures in
ha_innobase::truncate().

ha_innobase::truncate(): Do not invoke close(), but instead
mimic it, so that we can restore to the original table handle
in case opening the truncated copy of the table failed.
2019-03-13 13:31:45 +02:00
Jan Lindström
d0ebb155fe MDEV-18577: Indexes problem on import dump SQL
Problem was that we skipped background persistent statistics calculation
on applier nodes if thread is marked as high priority (a.k.a BF).
However, on applier nodes all DDL which is replicate will be executed
as high priority i.e BF.

Fixed by allowing background persistent statistics calculation on
applier nodes even when thread is marked as BF. This could lead
BF lock waits but for queries on that node needs that statistics.
2019-03-13 10:18:12 +02:00
Marko Mäkelä
ee17f7285a TruncateLogParser::scan(): Simplify string operations
This also fixes -Wstringop-truncation that GCC 8 issued for strncpy().
2019-03-12 19:46:28 +02:00
Marko Mäkelä
945edfff2b Silence GCC 8 -Wno-class-memaccess 2019-03-12 19:46:28 +02:00
Marko Mäkelä
bef947b4c9 MDEV-18902 Uninitialized variable in recv_parse_log_recs()
recv_parse_log_recs(): Do not compare type if ptr==end_ptr
(we have reached the end of the redo log parsing buffer),
because it will not have been correctly initialized in that case.
2019-03-12 14:03:10 +02:00
Marko Mäkelä
e070cfe398 MDEV-18878: Fix GCC -flifetime-dse
GCC 6 and later can optimize away the memset() that is part of
mem_heap_zalloc() in a placement new call. So, instead of relying
on that kind of initialization, explicitly initialize the necessary
fields in the constructors.

que_common_t::que_common_t(): Initialize more fields in the
default constructor.

purge_vcol_info_t::purge_vcol_info_t(): Initialize all fields in
the default constructor.

purge_node_t::purge_node_t(): Initialize all necessary fields.

Reference:

    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71388

    https://gcc.gnu.org/ml/gcc/2016-02/msg00207.html
2019-03-12 13:56:58 +02:00
Marko Mäkelä
e374755bae Merge 10.1 into 10.2 2019-03-12 13:11:07 +02:00
Marko Mäkelä
32de60bb2e MDEV-18749: Fix GCC -flifetime-dse
row_merge_create_fts_sort_index(): Initialize dict_col_t in
an unambiguous way. GCC 6 and later appear to be able to optimize
away the memset() that is part of mem_heap_zalloc() in the
placement new call. Let us avoid using placement new in order
to ensure that the objects will actually be initialized.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71388

https://gcc.gnu.org/ml/gcc/2016-02/msg00207.html

While the latter reference hints that the optimization is only
applicable to non-POD types (and dict_col_t does not define
any member functions before 10.2), it is most consistent to
use the same initialization across all versions.
2019-03-12 13:03:20 +02:00
Marko Mäkelä
28e713dc12 MDEV-18878: Correct a condition
Initialize node->trx_id before checking if a table can be skipped.
2019-03-11 17:29:46 +02:00
Marko Mäkelä
6e76704613 MDEV-18878: Slimmer purge in non-debug builds
purge_node_t::in_progress: Replaces purge_node_t::done.
Only present in debug builds.

purge_node_t::start(): Moved from the start of row_purge_step().

purge_node_t::end(): Replaces row_purge_end().

trx_purge_attach_undo_recs(): Omit a check from non-debug builds.
2019-03-11 17:18:37 +02:00
Marko Mäkelä
1ab049e572 MDEV-18878 Purge: Optimize away futile table lookups
If a table has been dropped, rebuilt, or its tablespace has been
discarded or the table is corrupted, it does not make sense to
look up that table again while purging old undo log records.

purge_node_t::purge_node_t(): Replaces row_purge_node_create().

que_common_t::que_common_t(): Constructor.

row_import_update_index_root(): Remove the constant parameter
dict_locked=true, and update the table->def_trx_id in the cache.

purge_node_t::unavailable_table_id: The latest unavailable table ID,
to avoid future lookups.

purge_node_t::def_trx_id: The latest modification of the table
identified by unavailable_table_id, or TRX_ID_MAX.

purge_node_t::is_skipped(): Determine if a table should be skipped.

purge_node_t::skip(): Note that a table should be skipped.
2019-03-11 17:17:24 +02:00
Marko Mäkelä
b4cda8bbbc After-merge fix for GCC
GCC does not like MY_ATTRIBUTE((nonnull)) on a reference-to-pointer
parameter. clang did not flag an issue wit that.
2019-03-07 18:54:53 +02:00
Marko Mäkelä
913e33e423 Merge 10.1 into 10.2
Rewrite the MDEV-13818 fix to prevent heap-use-after-free.

Add a test case for MDEV-18272.
2019-03-07 17:52:27 +02:00
Marko Mäkelä
e3adf96aeb MDEV-13818 CREATE INDEX leaks memory if running out of undo log space
row_merge_create_index_graph(): Relay the internal state
from dict_create_index_step(). Our caller should free the index
only if it was not copied, added to the cache, and freed.

row_merge_create_index(): Free the index template if it was
not added to the cache. This is a safer variant of the logic
that was introduced in 65070beffd in 10.2.

prepare_inplace_alter_table_dict(): Add additional fault injection
to exercise a code path where we have already added an index
to the cache.
2019-03-07 15:35:55 +02:00
Marko Mäkelä
4c0f43f45a Merge 10.0 into 10.1 2019-03-07 12:27:42 +02:00
Marko Mäkelä
0a0eed8016 Merge 5.5 into 10.0 2019-03-07 12:04:22 +02:00
Marko Mäkelä
8024f8c6b8 MDEV-18272 InnoDB fails to rollback after exceeding FOREIGN KEY recursion depth
row_mysql_handle_errors(): Correct the wrong error handling for
the code DB_FOREIGN_EXCEED_MAX_CASCADE that was introduced in

c0923d396a

    commit 35f5429eda
    Author: Jimmy Yang <jimmy.yang@oracle.com>
    Date:   Wed Oct 6 06:55:34 2010 -0700

        Manual port Bug #Bug #54582 "stack overflow when opening many tables
        linked with foreign keys at once" from mysql-5.1-security to
        mysql-5.5-security again.

        rb://391 approved by Heikki

No known test case exists for repeating the bug before MariaDB 10.2.
The scenario should be that DB_FOREIGN_EXCEED_MAX_CASCADE is returned,
then InnoDB wrongly skips the rollback to the start of the current
row operation, and finally the SQL layer commits the transaction.
Normally the SQL layer would roll back either the entire transaction or
to the start of the statement. In the faulty scenario, InnoDB would
leave the transaction in an inconsistent state, and the SQL layer could
commit the transaction.
2019-03-07 11:57:14 +02:00
Sergei Golubchik
65070beffd MDEV-13818 CREATE INDEX leaks memory if running out of undo log space
free already allocated indexes if row_merge_create_index() fails

This fixes innodb.alter_crash failure
in ASAN_OPTIONS="abort_on_error=1" runs
2019-03-06 15:28:27 +01:00
Marko Mäkelä
c155946c90 Merge 10.1 into 10.2 2019-03-06 15:15:59 +02:00
Marko Mäkelä
485dcb07d1 MDEV-18637 Assertion `cache' failed in fts_init_recover_doc
I know no test case for this bug in 10.1. So a test case will be
committed separately in 10.2

fts_reset_get_doc(): properly initialize fts_get_doc_t::cache
2019-03-06 14:46:58 +02:00
Marko Mäkelä
4b5dc47f56 MDEV-18659: Revert a non-functional change
fts_fetch_index_words(): Restore the initialization len=0.
The test innodb_fts.create in 10.2 would end up in an infinite loop
if this assignment is removed, because a following iteration of the
while() loop would assign zip->zp->avail_in=len with the original value
instead of the 0 that was reset in the previous iteration.
2019-03-06 12:45:54 +02:00
Marko Mäkelä
b761211685 MDEV-18659: Fix string truncation/overflow in InnoDB and XtraDB
Fix the warnings issued by GCC 8 -Wstringop-truncation
and -Wstringop-overflow in InnoDB and XtraDB.

This work is motivated by Jan Lindström. The patch mainly differs
from his original one as follows:

(1) We remove explicit initialization of stack-allocated string buffers.
The minimum amount of initialization that is needed is a terminating
NUL character.
(2) GCC issues a warning for invoking strncpy(dest, src, sizeof dest)
because if strlen(src) >= sizeof dest, there would be no terminating
NUL byte in dest. We avoid this problem by invoking strncpy() with
a limit that is 1 less than the buffer size, and by always writing
NUL to the last byte of the buffer.
(3) We replace strncpy() with memcpy() or strcpy() in those cases
when the result is functionally equivalent.

Note: fts_fetch_index_words() never deals with len==UNIV_SQL_NULL.
This was enforced by an assertion that limits the maximum length
to FTS_MAX_WORD_LEN. Also, the encoding that InnoDB uses for
the compressed fulltext index is not byte-order agnostic, that is,
InnoDB data files that use FULLTEXT INDEX are not portable between
big-endian and little-endian systems.
2019-03-06 11:22:27 +02:00
Marko Mäkelä
b21930fb0f MDEV-18749: Uninitialized value upon ADD FULLTEXT INDEX
row_merge_create_fts_sort_index(): Initialize dict_col_t.

This fixes an access to uninitialized dict_col_t::ind when a debug
assertion in MariaDB 10.4 invokes is_dropped() in
rec_get_converted_size_comp_prefix_low(). Older MariaDB versions
seem to be unaffected by the uninitialized values, but it should
not hurt to initialize everything.
2019-03-06 10:37:43 +02:00
Marko Mäkelä
730ed9907b After-merge fix: Add explicit type conversion
Lesson learned: A HA_TOPTION_SYSVAR that is bound to
MYSQL_THDVAR_UINT does not have the type uint, but ulonglong.
2019-03-04 18:38:42 +02:00
Marko Mäkelä
9835f7b80f Merge 10.1 into 10.2 2019-03-04 16:46:58 +02:00
Marko Mäkelä
91e4f00389 MDEV-18732 InnoDB: ALTER IGNORE returns error for NULL
Only starting with MariaDB 10.3.8 (MDEV-16365), InnoDB can actually
handle ALTER IGNORE TABLE correctly when introducing a NOT NULL
attribute to a column that contains a NULL value. Between
MariaDB Server 10.0 and 10.2, we would incorrectly return an error
for ALTER IGNORE TABLE when the column contains a NULL value.
2019-03-04 15:16:27 +02:00
Marko Mäkelä
4a1c66290d MDEV-18775 Fix ALTER TABLE error handling for DROP INDEX
On an error (such as when an index cannot be dropped due to
FOREIGN KEY constraints), the field dict_index_t::to_be_dropped
was only being cleared in debug builds, even though the field
is available and being used also in non-debug builds.

This was a regression that was introduced by myself originally
in MySQL 5.7.6 and later merged to MariaDB 10.2.2, in
d39898de8e

An error manifested itself in the MariaDB Server 10.4 non-debug build,
involving instant ADD or DROP column. Because an earlier failed
ALTER TABLE operation incorrectly left the dict_index_t::to_be_dropped
flag set, the column pointers of the index fields would fail to be
adjusted for instant ADD or DROP column (MDEV-15562). The instant
ADD COLUMN in MariaDB Server 10.3 is unlikely to be affected by a
similar scenario, because dict_table_t::instant_add_column() in 10.3
is applying the transformations to all indexes, not skipping
to-be-dropped ones.
2019-03-01 12:16:12 +02:00
Marko Mäkelä
e39d6e0c53 MDEV-18601 Can't create table with ENCRYPTED=DEFAULT when innodb_default_encryption_key_id!=1
The problem with the InnoDB table attribute encryption_key_id is that it is
not being persisted anywhere in InnoDB except if the table attribute
encryption is specified and is something else than encryption=default.
MDEV-17320 made it a hard error if encryption_key_id is specified to be
anything else than 1 in that case.

Ideally, we would always persist encryption_key_id in InnoDB. But, then we
would have to be prepared for the case that when encryption is being enabled
for a table whose encryption_key_id attribute refers to a non-existing key.

In MariaDB Server 10.1, our best option remains to not store anything
inside InnoDB. But, instead of returning the error that MDEV-17320
introduced, we should merely issue a warning that the specified
encryption_key_id is going to be ignored if encryption=default.

To improve the situation a little more, we will issue a warning if
SET [GLOBAL|SESSION] innodb_default_encryption_key_id is being set
to something that does not refer to an available encryption key.

Starting with MariaDB Server 10.2, thanks to MDEV-5800, we could open the
table definition from InnoDB side when the encryption is being enabled,
and actually fix the root cause of what was reported in MDEV-17320.
2019-02-28 23:20:31 +02:00
Jan Lindström
f65f40bb35 Merge remote-tracking branch 'origin/10.1' into 10.2 2019-02-28 13:08:11 +02:00
Julius Goryavsky
2c734c980e MDEV-9519: Data corruption will happen on the Galera cluster size change
If we have a 2+ node cluster which is replicating from an async master
and the binlog_format is set to STATEMENT and multi-row inserts are executed
on a table with an auto_increment column such that values are automatically
generated by MySQL, then the server node generates wrong auto_increment
values, which are different from what was generated on the async master.

In the title of the MDEV-9519 it was proposed to ban start slave on a Galera
if master binlog_format = statement and wsrep_auto_increment_control = 1,
but the problem can be solved without such a restriction.

The causes and fixes:

1. We need to improve processing of changing the auto-increment values
after changing the cluster size.

2. If wsrep auto_increment_control switched on during operation of
the node, then we should immediately update the auto_increment_increment
and auto_increment_offset global variables, without waiting of the next
invocation of the wsrep_view_handler_cb() callback. In the current version
these variables retain its initial values if wsrep_auto_increment_control
is switched on during operation of the node, which leads to inconsistent
results on the different nodes in some scenarios.

3. If wsrep auto_increment_control switched off during operation of the node,
then we must return the original values of the auto_increment_increment and
auto_increment_offset global variables, as the user has set. To make this
possible, we need to add a "shadow copies" of these variables (which stores
the latest values set by the user).

https://jira.mariadb.org/browse/MDEV-9519
2019-02-26 07:45:11 +02:00
Julius Goryavsky
243f829c1c MDEV-9519: Data corruption will happen on the Galera cluster size change
If we have a 2+ node cluster which is replicating from an async master
and the binlog_format is set to STATEMENT and multi-row inserts are executed
on a table with an auto_increment column such that values are automatically
generated by MySQL, then the server node generates wrong auto_increment
values, which are different from what was generated on the async master.

In the title of the MDEV-9519 it was proposed to ban start slave on a Galera
if master binlog_format = statement and wsrep_auto_increment_control = 1,
but the problem can be solved without such a restriction.

The causes and fixes:

1. We need to improve processing of changing the auto-increment values
after changing the cluster size.

2. If wsrep auto_increment_control switched on during operation of
the node, then we should immediately update the auto_increment_increment
and auto_increment_offset global variables, without waiting of the next
invocation of the wsrep_view_handler_cb() callback. In the current version
these variables retain its initial values if wsrep_auto_increment_control
is switched on during operation of the node, which leads to inconsistent
results on the different nodes in some scenarios.

3. If wsrep auto_increment_control switched off during operation of the node,
then we must return the original values of the auto_increment_increment and
auto_increment_offset global variables, as the user has set. To make this
possible, we need to add a "shadow copies" of these variables (which stores
the latest values set by the user).

https://jira.mariadb.org/browse/MDEV-9519
2019-02-25 11:19:07 +02:00
Marko Mäkelä
15a20367fb buf_page_get_zip(): Deduplicate some code 2019-02-23 10:31:07 +02:00
Marko Mäkelä
2c8d9a4e59 MDEV-10813: Update buf_page_t::buf_fix_count outside mutex
Since MySQL 5.6.16 (and MariaDB Server 10.0.11), changes of
buf_page_t::buf_fix_count are atomic memory operations if
PAGE_ATOMIC_REF_COUNT is defined. Since MySQL 5.7
(and MariaDB Server 10.2.2), the field is always updated
by atomic memory operations.

In a few occurrences, updates of the counter were unnecessarily
surrounded by an acquisition and release of the block mutex
(buf_block_t::mutex or buf_pool_t::zip_mutex). Remove these
unnecessary mutex operations.
2019-02-22 22:56:22 +02:00
Eugene Kosov
28cb041754 MDEV-18662 ib_wqueue_t has a data race
ib_wqueue_is_empty(): protect ib_list_is_empty() call

Closes #1202
2019-02-21 12:23:47 +02:00
Thirunarayanan Balathandayuthapani
88b6dc4db5 MDEV-18639 Assertion failure upon attempt to start with lower number of undo tablespaces
trx_rseg_t::is_persistent(): Correct a bogus debug assertion.
2019-02-19 12:18:50 +02:00
Marko Mäkelä
ca76fc4a3a MDEV-18611: mariabackup silently ended during xtrabackup_copy_logfile()
log_group_read_log_seg(): Always return false when returning
before reading end_lsn.

xtrabackup_copy_logfile(): On error, indicate whether
a corrupt log record was encountered.

Only xtrabackup_copy_logfile() in Mariabackup cared about the
return value of the function. InnoDB crash recovery was not
affected by this bug.
2019-02-19 11:14:03 +02:00
Marko Mäkelä
e3f6ea5080 Merge 10.1 into 10.2 2019-02-18 22:21:02 +02:00
Marko Mäkelä
98e185ee37 MDEV-18630 Uninitialised value in FOREIGN KEY error message
dict_create_foreign_constraints_low(): Clean up the way in
which the error messages are initialized, and ensure that
the table name is always initialized.
2019-02-18 21:42:58 +02:00
Marko Mäkelä
9e4f299404 Merge 10.1 into 10.2 2019-02-11 11:42:18 +02:00
Marko Mäkelä
be25414828 MDEV-18016: Cover the no-rebuild case, and remove a bogus debug assertion
The code path where the table was not being rebuilt during ALTER TABLE
was not covered by the test. Add coverage, and remove the debug assertion
that could fail in this case.
2019-02-11 11:38:18 +02:00
Marko Mäkelä
e84dc567be Merge 10.1 into 10.2 2019-02-05 17:35:48 +02:00
Marko Mäkelä
5eb3e4d83c MDEV-15798 Mutex leak on accessing INFORMATION_SCHEMA.INNODB_MUTEXES
i_s_innodb_mutexes_fill_table(): Use the C++ RAII pattern
to ensure that the mutexes are released if an OK()
macro returns from the function prematurely.
2019-02-05 17:03:41 +02:00
Marko Mäkelä
625994b7cc MDEV-16982 Server crashes in mem_heap_dup upon DELETE from table with virtual columns
An uninitialized buffer is passed to row_sel_store_mysql_rec() but
InnoDB may not initialize everything. Looks like it's ok in most cases
but not always.
The partially initialized buffer was later passed to
ha_innobase::write_row() which reads random NULL bit values for
virtual columns and random stuff happens.

No test case for MariaDB 10.2 was found.
The test case for MariaDB 10.3 involves partitioning,
system versioning and the TRASH_ALLOC fill pattern 0xA5.
Test case depends very much on the number and layout of columns.
Think about 0xA5 byte for a NULL bit mask.

row_sel_store_mysql_rec(): always initialize virtual columns NULL bit

Closes #1144
2019-02-05 12:02:41 +02:00
Sergei Golubchik
78d5a764b2 compiler warnings 2019-02-05 01:34:17 +01:00
Marko Mäkelä
a249e57b68 Merge 10.1 into 10.2
Temporarily disable a test for
commit 2175bfce3e
because fixing it in 10.2 requires updating libmariadb.
2019-02-03 17:22:05 +02:00
Marko Mäkelä
955c7b3222 MDEV-16896 encryption.innodb-checksum-algorithm crashes
buf_page_is_corrupted(): Read the global variable srv_checksum_algorithm
only once in order to avoid a race condition when
SET GLOBAL innodb_checksum_algorithm=...;
is being executed concurrently with this function.
2019-02-03 17:09:20 +02:00
Marko Mäkelä
213ece2f2e Merge 10.1 into 10.1
This is joint work with Oleksandr Byelkin.
2019-02-02 13:00:15 +02:00
Marko Mäkelä
c1e1764fc4 Fix embedded innodb_plugin after 560799ebd8
wsrep_certification_rules: Define as a weak global symbol.
While there are separate _embedded.a for statically
linked storage engine plugins, there is only one ha_innodb.so
which is supposed to work with both values of WITH_WSREP.

The merge from 10.0-galera introduced a reference to a global
variable that is only defined when the server is built WITH_WSREP.
We must define that symbol as weak global, so that when
a dynamically linked InnoDB or XtraDB is used with the embedded
server (which never includes write-set replication patches),
the variable will be read as 0, instead of causing a failure to
load the InnoDB or XtraDB plugin.
2019-02-02 12:49:04 +02:00
Marko Mäkelä
081fd8bfa2 Merge 10.1 into 10.2 2019-02-02 11:40:02 +02:00
Thirunarayanan Balathandayuthapani
7c7161a1bd MDEV-18194 Incremental prepare tries to access page which is out of tablespace bounds
Problem:
=======
Mariabackup incremental prepare creates new tablespace when it encounter
new tablespace. It sets the intial size as FIL_IBD_FILE_INITIAL_SIZE (4).
But while applying redo log, it tries to access 5th page and then
it leads to out of tablespace error.

Fix:
===
While parsing the redo log record, track FSP_SIZE in recv_spaces for the
respective space id. Assign the recv_size for the tablespace when it
is loaded. Extend the tablespace depends on recv_size while applying
the redo log record.
2019-02-01 09:15:53 +02:00
Marko Mäkelä
6051842cd0 Minor clean-up 2019-02-01 09:13:18 +02:00
Thirunarayanan Balathandayuthapani
f669cecbe3 MDEV-18415 mariabackup.mdev-14447 test case fails with Table 'test.t' doesn't exist in engine
- Added retry logic if validation of first page fails with checksum
mismatch.
2019-02-01 08:53:50 +02:00
Oleksandr Byelkin
560799ebd8 Merge branch '10.0-galera' into 10.1 2019-01-31 09:34:34 +01:00
Thirunarayanan Balathandayuthapani
b8aef87221 MDEV-16849 Extending indexed VARCHAR column should be instantaneous
Analysis:
========
Increasing the length of the indexed varchar column is not an instant operation for
innodb.

Fix:
===
- Introduce the new handler flag 'Alter_inplace_info::ALTER_COLUMN_INDEX_LENGTH' to
indicate the index length differs due to change of column length changes.

- InnoDB makes the ALTER_COLUMN_INDEX_LENGTH flag as instant operation.

This is a port of Mysql fix.

    commit 913071c0b16cc03e703308250d795bc381627e37
    Author: Nisha Gopalakrishnan <nisha.gopalakrishnan@oracle.com>
    Date:   Wed May 30 14:54:46 2018 +0530

        BUG#26848813: INDEXED COLUMN CAN'T BE CHANGED FROM VARCHAR(15)
                      TO VARCHAR(40) INSTANTANEOUSLY
2019-01-30 15:33:32 +05:30
Marko Mäkelä
1522ee2949 MDEV-18016: Assertion failure on ALTER TABLE after foreign_key_checks=0
ha_innobase::commit_inplace_alter_table(): Do not crash if
innobase_update_foreign_cache() returns an error. It can return
an error on ALTER TABLE if an inconsistent FOREIGN KEY constraint
was created earlier when SET foreign_key_checks=0 was in effect.
Instead, report a warning to the client that constraints cannot
be loaded.
2019-01-29 15:20:26 +02:00
Marko Mäkelä
6699cac0bf MDEV-18256 Duplicated call to dict_foreign_remove_from_cache()
ha_innobase::prepare_inplace_alter_table(): Filter out duplicates
from ha_alter_info->alter_info->drop_list.elements.
2019-01-29 15:20:26 +02:00
Marko Mäkelä
5e06ee41a4 MDEV-18222: Duplicated call to dict_foreign_remove_from_cache()
innobase_rename_column_try(): Declare fk_evict as std::set
instead of std::list, in order to filter out duplicates.
2019-01-29 15:20:26 +02:00
Oleksandr Byelkin
724b09d5e7 Version fix after merge 2019-01-28 15:42:16 +01:00
Oleksandr Byelkin
94b68b35f4 Reverting part of da34c7de5d that was already fixed by MDEV-17531 by Marko 2019-01-28 15:39:27 +01:00
Jan Lindström
97930df13c
Merge pull request #1142 from codership/10.2-MDEV-15740
MDEV-15740 Fixes to Galera transaction recovery
2019-01-28 12:01:35 +02:00
Teemu Ollakka
040b840de7 MDEV-15740 Backport wsrep recovery fixes from 10.4.
Clear wsrep XID in innobase_rollback_by_xid() for recovered wsrep
transaction in order to avoid resetting XID storage when rolling back
wsrep transaction during recovery.

Sort wsrep XIDs read from storage engine in ascending order and
erify that the range is continuous during crash recovery. If binlog is off,
commit all recovered transactions for continuous seqno range. This is safe
because all transactions with wsrep XID have been certified and must be
committed in the cluster. On the other hand if binlog is on, respect binlog
as a transaction coordinator in order to avoid missing transactions in binlog
that have been committed into storage engine .
2019-01-25 16:19:20 +02:00
Eugene Kosov
31d0727a10 MDEV-18235: Changes related to fsync()
Remove fil_node_t::sync_event.

I had a discussion with kernel fellows and they said it's safe to call
fsync() simultaneously at least on VFS and ext4. So initially I wanted
to disable check for recent Linux but than I realized code is buggy.

Consider a case when one thread is inside fsync() and two others are
waiting inside os_event. First thread after fsync() calls os_event_set()
which is a broadcast! So two waiting threads will awake and may call
fsync() at the same time.

One fix is to add a notify_one() functionality to os_event but I decided
to remove incorrect check completely. Note, it works for one waiting
thread but not for more than one.

IMO it's ok to avoid existing bugs but there is not too much sense in
avoiding possible(!) bugs as this code does.

fil_space_t::is_in_rotation_list(), fil_space_t::is_in_unflushed_spaces():
Replace redundant bool fields with member functions.

fil_node_t::needs_flush: Replaces fil_node_t::modification_counter and
fil_node_t::flush_counter. We need to know whether there _are_ some
unflushed writes and we do not need to know _how many_ writes.

fil_system_t::modification_counter: Remove as not needed.
Even if we needed fil_node_t::modification_counter, every file
could have its own counter that would be incremented on each write.

fil_system_t::modification_counter is a global modification counter
for all files. It was incremented on every write. But whether some
file was flushed or not is an internal fil_node_t deal/state and
this makes fil_system_t::modification_counter useless.

Closes #1061
2019-01-25 15:40:04 +02:00
Marko Mäkelä
46f712c73c MDEV-15114: Fix memory leaks
When innobase_allocate_row_for_vcol() returns true (for failure),
it may already have invoked mem_heap_create(). However, some callers
would fail to invoke mem_heap_free().
2019-01-24 12:32:27 +02:00
Marko Mäkelä
d283f80eae Update the InnoDB version number 2019-01-23 19:46:35 +02:00
Aditya A
aa8a31dadd Bug #22990029 GCOLS: INCORRECT BEHAVIOR AFTER DATA INSERTED WITH IGNORE KEYWORD
PROBLEM
-------

1. We are inserting a base column entry which causes an invalid value
   by the function provided to generate virtual column,but we go ahead
   and insert this due to ignore keyword.
2. We then delete this record, making this record delete marked in innodb.
   If we try to insert another record with the same pk as the deleted
   record and if the rec is not purged ,then we try to undelete mark this
   record and try to build a update vector with previous and updated value
   and while calculating the value of virtual column we get error from
   server that we cannot calculate this from base column.
   Innodb assumes that innobase_get_computed_value() Should always return
   a valid value for the base column present in the row. The failure of
   this call was not handled ,so we were crashing.

FIX
2019-01-23 19:46:35 +02:00
Marko Mäkelä
9a7281a703 Merge 10.1 into 10.2 2019-01-23 15:09:06 +02:00
Marko Mäkelä
d06ebd932d Remove references to removed dict_sys->size 2019-01-23 14:42:21 +02:00
Marko Mäkelä
2565c02ca5 Remove unnecessary type casts 2019-01-23 14:42:21 +02:00
Marko Mäkelä
3b6d2efcb1 Merge 10.0 into 10.1 2019-01-23 14:34:23 +02:00
Marko Mäkelä
52d13036d8 MDEV-17933 slow server status - dict_sys_get_size()
dict_sys_get_size(): Replace the time-consuming loop with
a crude estimate that can be computed without holding any mutex.

Even before dict_sys->size was removed in MDEV-13325,
not all memory allocations by the InnoDB data dictionary cache
were being accounted for. One example is foreign key constraints.
Another example is virtual column metadata, starting with 10.2.
2019-01-23 13:44:20 +02:00