Commit graph

5390 commits

Author SHA1 Message Date
Marko Mäkelä
7bfe33ee28 MDEV-10814 innodb large allocations - Don't dump
Merge pull request #364
2018-02-21 19:15:20 +02:00
Marko Mäkelä
094cf73045 Avoid some dead code 2018-02-21 09:46:51 +02:00
Vladislav Vaintroub
56e7b7eaed Make possible to use clang on Windows (clang-cl)
-DWITH_ASAN can be used as well now, on x64

Fix many clang-cl warnings.
2018-02-20 21:17:36 +00:00
Marko Mäkelä
947efe17ed MDEV-15158 On commit, do not write to the TRX_SYS page
This is based on a prototype by
Thirunarayanan Balathandayuthapani <thiru@mariadb.com>.

Binlog and Galera write-set replication information was written into
TRX_SYS page on each commit. Instead of writing to the TRX_SYS during
normal operation, InnoDB can make use of rollback segment header pages,
which are already being written to during a commit.

The following list of fields in rollback segment header page are added:
    TRX_RSEG_BINLOG_OFFSET
    TRX_RSEG_BINLOG_NAME (NUL-terminated; empty name = not present)
    TRX_RSEG_WSREP_XID_FORMAT (0=not present; 1=present)
    TRX_RSEG_WSREP_XID_GTRID
    TRX_RSEG_WSREP_XID_BQUAL
    TRX_RSEG_WSREP_XID_DATA

trx_sys_t: Introduce the fields
recovered_binlog_filename, recovered_binlog_offset, recovered_wsrep_xid.

To facilitate upgrade from older mysql or mariaDB versions, we will read
the information in TRX_SYS page. It will be overridden by the
information that we find in rollback segment header pages.

Mariabackup --prepare will read the metadata from the rollback
segment header pages via trx_rseg_array_init(). It will still
not read any undo log pages or recover any transactions.
2018-02-20 21:36:36 +02:00
Marko Mäkelä
cd63f43c40 Fix the Windows build
my_atomic_load32() expects a pointer to non-const on Windows.
2018-02-20 13:08:43 +02:00
Marko Mäkelä
1fa14a7c08 Replace trx_undo_mem_free() with ut_free() 2018-02-20 10:30:14 +02:00
Marko Mäkelä
60a68fdf71 Clarify the access to trx_sys.rseg_history_len
trx_sys_t::rseg_history_len: Make private, and clarify the
documentation.

trx_sys_t::history_size(): Read rseg_history_len.

trx_sys_t::history_insert(), trx_sys_t::history_remove(),
trx_sys_t::history_add(): Update rseg_history_len.
2018-02-20 10:30:08 +02:00
Marko Mäkelä
5521994ce2 Pull request #614: various small code changes 2018-02-19 11:37:45 +02:00
Marko Mäkelä
2ba487cfe8 Merge bb-10.2-ext into 10.3 2018-02-19 11:37:29 +02:00
Marko Mäkelä
278c036275 Merge 10.2 into bb-10.2-ext 2018-02-19 09:01:06 +02:00
Marko Mäkelä
3c419fde5f Cleanup after commit ac8e3c85a4
srv_conc_t::n_active: Correct the comment, and remove an
assertion that trivially holds now that the type is unsigned.
2018-02-19 08:58:22 +02:00
Vladislav Vaintroub
acab33a1f2 Merge branch '10.2-backup-fixes' into 10.2 2018-02-18 10:49:46 +00:00
Marko Mäkelä
970ce270c9 Merge 10.1 into 10.2
Disable the test encryption.innodb_encryption-page-compression
because the wait_condition would seem to time out deterministically.
MDEV-14814 has to be addressed in 10.2 separately.

Datafile::validate_first_page(): Do not invoke
page_size_t::page_size_t(flags) before validating the tablespace flags.
This avoids a crash in MDEV-15333 innodb.restart test case.
FIXME: Reduce the number of error messages. The first one is enough.
2018-02-17 14:54:12 +02:00
Marko Mäkelä
9a46d97149 MDEV-15333 MariaDB (still) slow start
This performance regression was introduced in the MariaDB 10.1
file format incompatibility bug fix MDEV-11623 (MariaDB 10.1.21
and MariaDB 10.2.4) and partially fixed in MariaDB 10.1.25 in
MDEV-12610 without adding a regression test case.

On a normal startup (without crash recovery), InnoDB should not read
every .ibd data file, because this is slow. Like in MySQL, for now,
InnoDB will still open every data file (without reading), and it
will read every .ibd file for which an .isl file exists, or the
DATA DIRECTORY attribute has been specified for the table.

The test case shuts down InnoDB, moves data files, replaces them
with garbage, and then restarts InnoDB, expecting no messages to
be issued for the garbage files. (Some messages will for now be
issued for the table that uses the DATA DIRECTORY attribute.)
Finally, the test shuts down the server, restores the old data files,
and restarts again to drop the tables.

fil_open_single_table_tablespace(): Remove the condition on flags,
and only call fsp_flags_try_adjust() if validate==true
(reading the first page has been requested). The only caller with
validate==false is at server startup when we are processing all
records from SYS_TABLES. The flags passed to this function are
actually derived from SYS_TABLES.TYPE and SYS_TABLES.N_COLS,
and there never was any problem with SYS_TABLES in MariaDB 10.1.
The problem that MDEV-11623 was that incorrect tablespace flags
were computed and written to FSP_SPACE_FLAGS.
2018-02-17 14:20:33 +02:00
Daniel Black
b600f30786 MDEV-10814: Innodb large allocations - madvise - Don't dump
Note: Linux only

Core dumps of large buffer pool pages take time and space
and pose potential data expose in scenarios where data-at-rest
encryption is deployed.

Here we use madvise(MADV_DONT_DUMP) on large memory allocations
used by the innodb buffer pool, log_sys and recv_sys. The effect
of this system call is that these memory areas will not appear in
a core dump. Data from these buffers is rarely useful in fault
diagnosis.

log_sys and recv_sys structures now use large memory allocations
for their large buffer.

Debug builds don't include the madvise syscall and as such will
include full core dumps.

A function, buf_madvise_do_dump, is added but never called. It
is there to be called from a debugger to re-enable the core
dumping of all of these pages if for some reason the entire
contents of these buffers are needed.

Idea thanks to Hartmut Holzgraefe
2018-02-17 20:00:56 +11:00
Eugene Kosov
f02f1eda7e review fixes 2018-02-16 22:15:51 +03:00
Eugene Kosov
6de8f79b11 remove unneeded variable 2018-02-16 21:44:51 +03:00
Eugene Kosov
de4c9f460c change some ibool to bool 2018-02-16 21:44:51 +03:00
Eugene Kosov
e14790b89d let buf_page_hash_lock_get() be function, not macro 2018-02-16 21:44:51 +03:00
Eugene Kosov
365f478240 make buf_block_t::lock_hash_val uint32_t 2018-02-16 21:44:51 +03:00
Eugene Kosov
af2d260b37 merge btr_page_get_level_low() and btr_page_get_level() 2018-02-16 21:44:51 +03:00
Eugene Kosov
9d46bd8a24 flst_add_to_empty(): read len only in debug build 2018-02-16 21:44:50 +03:00
Eugene Kosov
aaf71116ee remove unused stuff:
que_cur_t
que_fork_t::cur_end
que_fork_t::cur_pos
que_fork_t::cur_on_row
2018-02-16 21:44:50 +03:00
Eugene Kosov
399ec84847 prettify lock_rec_has_to_wait() 2018-02-16 21:44:50 +03:00
Eugene Kosov
e4fa5492a9 prettify lock_has_to_wait 2018-02-16 21:44:50 +03:00
Marko Mäkelä
2bb19230d8 MDEV-15323 Follow-up to MDEV-14905: Skip FTS processing if innodb_read_only
fts_cmp_set_sync_doc_id(), fts_load_stopword(): Start the transaction
in read-only mode if innodb_read_only is set.

fts_update_sync_doc_id(), fts_commit_table(), fts_sync(),
fts_optimize_table(): Return DB_READ_ONLY if innodb_read_only is set.

fts_doc_fetch_by_doc_id(), fts_table_fetch_doc_ids():
Remove the code to start an internal transaction or to roll back,
because this is a read-only operation.
2018-02-15 15:00:46 +00:00
Marko Mäkelä
03400d974d Dead code removal: sess_t
The session object is not really needed for anything.
We can directly create and free the dummy purge_sys->query->trx.
2018-02-15 15:00:46 +00:00
Marko Mäkelä
37af958d23 MDEV-14905 Fulltext index modification committed during shutdown
If CREATE TABLE...FULLTEXT INDEX was initiated right before shutdown,
then the function fts_load_stopword() could commit modifications
after shutdown was initiated, causing an assertion failure in
the function trx_purge_add_update_undo_to_history().

Mark as internal all the read/write transactions that
modify fulltext indexes, so that they will be ignored by
the assertion that guards against transaction commits
after shutdown has been initiated.

fts_optimize_free(): Invoke trx_commit_for_mysql() just in case,
because in fts_optimize_create() we started the transaction as
internal, and fts_free_for_backgruond() would assert that the
flag is clear. Transaction commit would clear the flag.
2018-02-15 15:00:46 +00:00
Marko Mäkelä
6f314edac7 MDEV-14648 Restore fix for MySQL BUG#39053 - UNINSTALL PLUGIN does not allow the storage engine to cleanup open connections
Also, allow the MariaDB 10.2 server to link InnoDB dynamically
against ha_innodb.so (which is what mysql-test-run.pl expects
to exist, instead of the default name ha_innobase.so).

wsrep_load_data_split(): Instead of referring to innodb_hton_ptr,
check the handlerton::db_type. This was recently broken by me in
MDEV-11415.

innodb_lock_schedule_algorithm: Define as a weak global symbol,
so that WITH_WSREP will not depend on InnoDB being linked statically.
I tested this manually. Notably, running a test that only does
	SET GLOBAL wsrep_on=1;
with a static or dynamic InnoDB and
	./mtr --mysqld=--loose-innodb-lock-schedule-algorithm=fcfs
will crash with SIGSEGV at shutdown. With the default VATS
combination the wsrep_on is properly refused for both the
static and dynamic InnoDB.

ha_close_connection(): Do invoke the method also for plugins
for which UNINSTALL PLUGIN was deferred due to open connections.
Thanks to @svoj for pointing this out.

thd_to_trx(): Return a pointer, not a reference to a pointer.

check_trx_exists(): Invoke thd_set_ha_data() for assigning a transaction.

log_write_checkpoint_info(): Remove an unused DEBUG_SYNC point
that would cause an assertion failure on shutdown after deferred
UNINSTALL PLUGIN.

This was tested as follows:

cmake -DWITH_WSREP=1 -DPLUGIN_INNOBASE:STRING=DYNAMIC \
-DWITH_MARIABACKUP:BOOL=OFF ...
make
cd mysql-test
./mtr innodb.innodb_uninstall
2018-02-15 15:00:46 +00:00
Marko Mäkelä
633d252e32 Merge bb-10.2-ext into 10.3 2018-02-15 16:12:15 +02:00
Marko Mäkelä
4074c74556 Merge 10.2 into bb-10.2-ext 2018-02-15 15:41:47 +02:00
Marko Mäkelä
5ab4602810 MDEV-15323 Follow-up to MDEV-14905: Skip FTS processing if innodb_read_only
fts_cmp_set_sync_doc_id(), fts_load_stopword(): Start the transaction
in read-only mode if innodb_read_only is set.

fts_update_sync_doc_id(), fts_commit_table(), fts_sync(),
fts_optimize_table(): Return DB_READ_ONLY if innodb_read_only is set.

fts_doc_fetch_by_doc_id(), fts_table_fetch_doc_ids():
Remove the code to start an internal transaction or to roll back,
because this is a read-only operation.
2018-02-15 15:38:42 +02:00
Marko Mäkelä
cc3b5d1fe7 Merge bb-10.2-ext into 10.3 2018-02-15 11:48:30 +02:00
Vladislav Vaintroub
f082c7557e fix signed/unsigned mismatch on Windows 2018-02-15 09:02:37 +00:00
Marko Mäkelä
22770a9f9a Merge 10.2 into bb-10.2-ext 2018-02-15 10:57:27 +02:00
Marko Mäkelä
b006d2ead4 Merge bb-10.2-ext into 10.3 2018-02-15 10:22:03 +02:00
Marko Mäkelä
27ea2963fc Dead code removal: sess_t
The session object is not really needed for anything.
We can directly create and free the dummy purge_sys->query->trx.
2018-02-15 10:01:05 +02:00
Marko Mäkelä
7baea2efa2 MDEV-14905 Fulltext index modification committed during shutdown
If CREATE TABLE...FULLTEXT INDEX was initiated right before shutdown,
then the function fts_load_stopword() could commit modifications
after shutdown was initiated, causing an assertion failure in
the function trx_purge_add_update_undo_to_history().

Mark as internal all the read/write transactions that
modify fulltext indexes, so that they will be ignored by
the assertion that guards against transaction commits
after shutdown has been initiated.

fts_optimize_free(): Invoke trx_commit_for_mysql() just in case,
because in fts_optimize_create() we started the transaction as
internal, and fts_free_for_backgruond() would assert that the
flag is clear. Transaction commit would clear the flag.
2018-02-15 09:59:04 +02:00
Marko Mäkelä
5fe9b4a7ae MDEV-14648 Restore fix for MySQL BUG#39053 - UNINSTALL PLUGIN does not allow the storage engine to cleanup open connections
Also, allow the MariaDB 10.2 server to link InnoDB dynamically
against ha_innodb.so (which is what mysql-test-run.pl expects
to exist, instead of the default name ha_innobase.so).

wsrep_load_data_split(): Instead of referring to innodb_hton_ptr,
check the handlerton::db_type. This was recently broken by me in
MDEV-11415.

innodb_lock_schedule_algorithm: Define as a weak global symbol,
so that WITH_WSREP will not depend on InnoDB being linked statically.
I tested this manually. Notably, running a test that only does
	SET GLOBAL wsrep_on=1;
with a static or dynamic InnoDB and
	./mtr --mysqld=--loose-innodb-lock-schedule-algorithm=fcfs
will crash with SIGSEGV at shutdown. With the default VATS
combination the wsrep_on is properly refused for both the
static and dynamic InnoDB.

ha_close_connection(): Do invoke the method also for plugins
for which UNINSTALL PLUGIN was deferred due to open connections.
Thanks to @svoj for pointing this out.

thd_to_trx(): Return a pointer, not a reference to a pointer.

check_trx_exists(): Invoke thd_set_ha_data() for assigning a transaction.

log_write_checkpoint_info(): Remove an unused DEBUG_SYNC point
that would cause an assertion failure on shutdown after deferred
UNINSTALL PLUGIN.

This was tested as follows:

cmake -DWITH_WSREP=1 -DPLUGIN_INNOBASE:STRING=DYNAMIC \
-DWITH_MARIABACKUP:BOOL=OFF ...
make
cd mysql-test
./mtr innodb.innodb_uninstall
2018-02-15 09:59:03 +02:00
Vladislav Vaintroub
ac8e3c85a4 MDEV-15295 Type mismatch for srv_fatal_semaphore_wait_threshold 2018-02-14 18:39:56 +00:00
Sergey Vojtovich
b782971c58 MDEV-15246 - premature history data deletion
This is regression after bc7a1dc1fb of
MDEV-15104 - Optimise MVCC snapshot.

Aforementioned revision removes mutex lock around ReadView creation,
which allows views to be created concurrently. Effectively it
invalidates "oldest view" approach: no single view can be considered
oldest anymore. Instead we have to iterate trx_sys.m_views to find
min(m_low_limit_no), min(m_low_limit_id) and all transaction ids below
min(m_low_limit_id), which would form oldest view.

Second regression comes from c0d5d7c0ef
of MDEV-15104 - Optimise MVCC snapshot.

It removes mutex protection around trx->no assignment, which opens up
a gap between m_max_trx_id increment and transaction serialisation
number becoming visible through rw_trx_hash. While we're in this gap
concurrent thread may come and do MVCC snapshot without seeing allocated
but not yet assigned serialisation number. Then at some point purge
thread may clone this view. As a result it won't see newly allocated
serialisation number and may remove "unnecessary" history data of this
transaction from rollback segments.
2018-02-14 22:39:24 +04:00
Marko Mäkelä
0c4aeef976 Merge 10.2 into bb-10.2-ext 2018-02-13 16:51:45 +02:00
Marko Mäkelä
d9955b22e9 Merge 10.1 into 10.2 2018-02-13 14:49:47 +02:00
Marko Mäkelä
2202afd541 Merge 10.0 into 10.1 2018-02-13 14:32:17 +02:00
Marko Mäkelä
c051eaba46 MDEV-14988 innodb_read_only tries to modify files if transactions were recovered in COMMITTED state
lock_trx_release_locks(): Relax a debug assertion to allow
recovered TRX_STATE_COMMITTED_IN_MEMORY transactions.

trx_commit_in_memory(): Add DEBUG_SYNC instrumentation.

trx_undo_insert_cleanup(): Skip persistent changes if innodb_read_only
is set. This should only happen when a recovered committed transaction
would be cleaned up at shutdown.
2018-02-13 14:29:32 +02:00
Marko Mäkelä
00f0c039d2 MDEV-15270 Mariabackup should not try to use doublewrite buffer
When Mariabackup gets a bad read of the first page of the system
tablespace file, it would inappropriately try to apply the doublewrite
buffer and write changes back to the data file (to the source file)!
This is very wrong and must be prevented.

The correct action would be to retry reading the system tablespace
as well as any other files whose first page was read incorrectly.
Fixing this was not attempted.

xb_load_tablespaces(): Shorten a bogus message to be more relevant.
The message can be displayed by --backup or --prepare.

xtrabackup_backup_func(), os_file_write_func(): Add a missing space
to a message.

Datafile::restore_from_doublewrite(): Do not even attempt the
operation in Mariabackup.

recv_init_crash_recovery_spaces(): Do not attempt to restore the
doublewrite buffer in Mariabackup (--prepare or --export), because
all pages should have been copied correctly in --backup already,
and because --backup should ignore the doublewrite buffer.

SysTablespace::read_lsn_and_check_flags(): Do not attempt to initialize
the doublewrite buffer in Mariabackup.

innodb_make_page_dirty(): Correct the bounds check.

Datafile::read_first_page(): Correct the name of the parameter.
2018-02-12 16:56:01 +02:00
Marko Mäkelä
33f70c4d9c MDEV-13869 MariaDB slow start
When code from MySQL 5.7.9 was merged to MariaDB 10.2.2
in commit 2e814d4702
an assignment validate=true was inadvertently added to the function
dict_check_sys_tables().

This causes InnoDB to open every single .ibd file on startup, even
when no crash recovery was needed.

Simply removing the assignment would make some tests fail. We do the
best to retain almost the same level of inconsistency detection.
In the test innodb.table_flags, access to one of the tables will not
be blocked despite inconsistent flags.

dict_check_sys_tables(): Remove the problematic assignment, and skip
validation in normal startup.

dict_load_table_one(): If the .ibd file cannot be opened, mark the
table as corrupted and unreadable.

fil_node_open_file(): Validate FSP_SPACE_FLAGS with the expected
flags. If reading the tablespace fails, invalidate node->handle
instead of letting it remain stale. This bug was caught by a
fil_validate() assertion failure.

fsp_flags_try_adjust(): If the tablespace file is invalid, do nothing.
2018-02-12 14:06:04 +02:00
Alexander Barkov
da99e086f9 Merge remote-tracking branch 'origin/10.2' into bb-10.2-ext 2018-02-12 10:03:28 +04:00
Sergei Golubchik
49bcc82686 Merge branch '10.1' into 10.2 2018-02-11 13:47:16 +01:00
Alexey Botchkov
b88542681b MDEV-14611 ALTER TABLE EXCHANGE PARTITION does not work properly when used with DATA DIRECTORY.
When table is renamed, the InnoDB's dictionary cache didn't
        change the ib_table->data_dir_path accordingly.
        Now it's set to NULL.
2018-02-10 22:17:49 +04:00