Commit graph

1095 commits

Author SHA1 Message Date
Sergei Golubchik
b34cafe9d9 cleanup: move thread_count to THD_count::value()
because the name was misleading, it counts not threads, but THDs,
and as THD_count is the only way to increment/decrement it, it
could as well be declared inside THD_count.
2021-07-24 12:37:50 +02:00
Marko Mäkelä
b50ea90063 Merge 10.2 into 10.3 2021-07-22 18:57:54 +03:00
Marko Mäkelä
ca501ffb04 MDEV-26195: Use a 32-bit data type for some tablespace fields
In the InnoDB data files, we allocate 32 bits for tablespace identifiers
and page numbers as well as tablespace flags. But, in main memory
data structures we allocate 32 or 64 bits, depending on the register
width of the processor. Let us always use 32-bit fields to eliminate
a mismatch and reduce the memory footprint on 64-bit systems.
2021-07-22 11:22:47 +03:00
Marko Mäkelä
cee37b5d26 Merge 10.6 into 10.7 2021-07-22 11:22:09 +03:00
Marko Mäkelä
641f09398f Merge 10.5 into 10.6 2021-07-22 10:11:08 +03:00
Marko Mäkelä
82d5994520 MDEV-26110: Do not rely on alignment on static allocation
It is implementation-defined whether alignment requirements
that are larger than std::max_align_t (typically 8 or 16 bytes)
will be honored by the compiler and linker.

It turns out that on IBM AIX, both alignas() and MY_ALIGNED()
only guarantees alignment up to 16 bytes.

For some data structures, specifying alignment to the CPU
cache line size (typically 64 or 128 bytes) is a mere performance
optimization, and we do not really care whether the requested
alignment is guaranteed.

But, for the correct operation of direct I/O, we do require that
the buffers be aligned at a block size boundary.

field_ref_zero: Define as a pointer, not an array.
For innochecksum, we can make this point to unaligned memory;
for anything else, we will allocate an aligned buffer from the heap.
This buffer will be used for overwriting freed data pages when
innodb_immediate_scrub_data_uncompressed=ON. And exactly that code
hit an assertion failure on AIX, in the test innodb.innodb_scrub.

log_sys.checkpoint_buf: Define as a pointer to aligned memory
that is allocated from heap.

log_t::file::write_header_durable(): Reuse log_sys.checkpoint_buf
instead of trying to allocate an aligned buffer from the stack.
2021-07-22 10:05:13 +03:00
Sergei Golubchik
6190a02f35 Merge branch '10.2' into 10.3 2021-07-21 20:11:07 +02:00
Heinz Wiesinger
751ebe44fd Add feature summary at the end of cmake.
This gives a short overview over found/missing dependencies as well
as enabled/disabled features.

Initial author Heinz Wiesinger <heinz@m2mobi.com>
Additions by Vicențiu Ciorbaru <vicentiu@mariadb.org>
* Report all plugins enabled via MYSQL_ADD_PLUGIN
* Simplify code. Eliminate duplication by making use of WITH_xxx
  variable values to set feature "ON" / "OFF" state.

Reviewed by: wlad@mariadb.com (code details) serg@mariadb.com (the idea)
2021-07-21 10:22:56 +03:00
Marko Mäkelä
ed0a7b1b3f MDEV-24626 fixup: Remove useless code
fil_ibd_create(): Remove code that should have been removed in
commit 86dc7b4d4c already.
We no longer wrote an initialized page to the file, but we would
still allocate a page image in memory and write it.

xb_space_create_file(): Remove an unnecessary page write.
(This is a functional change for Mariabackup.)
2021-07-20 17:35:03 +03:00
Oli Sennhauser
da495b1b69 Typo fix in extrabackup.cc and innobackupex.cc
Thanks to @shinguz for helping with this.
This a backport commit from 10.7
2021-07-15 10:42:54 +03:00
Oli Sennhauser
2c4d1fb544 Typo fix in extrabackup.cc and innobackupex.cc
Thanks to @shinguz for helping with this.
2021-07-12 13:29:55 +03:00
Marko Mäkelä
f778a5d5e2 MDEV-25854: Remove garbage tables after restoring a backup
In commit 1c5ae99194 (MDEV-25666)
we had changed Mariabackup so that it would no longer skip files
whose names start with #sql. This turned out to be wrong.
Because operations on such named files are not protected by any
locks in the server, it is not safe to copy them.

Not copying the files may make the InnoDB data dictionary
inconsistent with the file system. So, we must do something
in InnoDB to adjust for that.

If InnoDB is being started up without the redo log (ib_logfile0)
or with a zero-length log file, we will assume that the server
was restored from a backup, and adjust things as follows:

dict_check_sys_tables(), fil_ibd_open(): Do not complain about
missing #sql files if they would be dropped a little later.

dict_stats_update_if_needed(): Never add #sql tables to
the recomputing queue. This avoids a potential race condition when
dropping the garbage tables.

drop_garbage_tables_after_restore(): Try to drop any garbage tables.

innodb_ddl_recovery_done(): Invoke drop_garbage_tables_after_restore()
if srv_start_after_restore (a new flag) was set and we are not in
read-only mode (innodb_read_only=ON or innodb_force_recovery>3).

The tests and dbug_mariabackup_event() instrumentation
were developed by Vladislav Vaintroub, who also reviewed this.
2021-06-17 13:46:16 +03:00
Marko Mäkelä
6ba938af62 MDEV-25905: Assertion table2==NULL in dict_sys_t::add()
In commit 49e2c8f0a6 (MDEV-25743)
we made dict_sys_t::find() incompatible with the rest of the
table name hash table operations in case the table name contains
non-ASCII octets (using a compatibility mode that facilitates the
upgrade into the MySQL 5.0 filename-safe encoding) and the target
platform implements signed char.

ut_fold_string(): Remove; replace with my_crc32c(). This also makes
table name hash value calculations independent on whether char
is unsigned or signed.
2021-06-14 12:38:56 +03:00
Vladislav Vaintroub
3d6eb7afcf MDEV-25602 get rid of __WIN__ in favor of standard _WIN32
This fixed the MySQL bug# 20338 about misuse of double underscore
prefix __WIN__, which was old MySQL's idea of identifying Windows
Replace it by _WIN32 standard symbol for targeting Windows OS
(both 32 and 64 bit)

Not that connect storage engine is not fixed in this patch (must be
fixed in "upstream" branch)
2021-06-06 13:21:03 +02:00
Marko Mäkelä
a722ee88f3 Merge 10.5 into 10.6 2021-06-01 11:39:38 +03:00
Marko Mäkelä
9c7a456a92 Merge 10.4 into 10.5 2021-06-01 10:38:09 +03:00
Marko Mäkelä
77d8da57d7 Merge 10.3 into 10.4 2021-06-01 09:14:59 +03:00
Marko Mäkelä
950a220060 Merge 10.2 into 10.3 2021-06-01 08:40:59 +03:00
Vladislav Vaintroub
d3c77e08ae MDEV-20556 Remove references to "xtrabackup" and "innobackupex" in mariabackup --help 2021-05-31 12:54:21 +02:00
Vladislav Vaintroub
5bd517259f MDEV-25815 mariabackup crash or debug assert with --backup --databases-exclude
Fix regression (debug assertion or division by 0)
caused by cfd3d70ccb
2021-05-29 06:32:40 +02:00
Rucha Deodhar
4e19539c14 MDEV-22189: Change error messages inside code to have mariadb instead of
mysql

Fix: Changed error messages, rerecorded results and changed other relevant
files.
2021-05-24 11:38:13 +05:30
Marko Mäkelä
49e2c8f0a6 MDEV-25743: Unnecessary copying of table names in InnoDB dictionary
Many InnoDB data dictionary cache operations require that the
table name be copied so that it will be NUL terminated.
(For example, SYS_TABLES.NAME is not guaranteed to be NUL-terminated.)

dict_table_t::is_garbage_name(): Check if a name belongs to
the background drop table queue.

dict_check_if_system_table_exists(): Remove.

dict_sys_t::load_sys_tables(): Load the non-hard-coded system tables
SYS_FOREIGN, SYS_FOREIGN_COLS, SYS_VIRTUAL on startup.

dict_sys_t::create_or_check_sys_tables(): Replaces
dict_create_or_check_foreign_constraint_tables() and
dict_create_or_check_sys_virtual().

dict_sys_t::load_table(): Replaces dict_table_get_low()
and dict_load_table().

dict_sys_t::find_table(): Renamed from get_table().

dict_sys_t::sys_tables_exist(): Check whether all the non-hard-coded
tables SYS_FOREIGN, SYS_FOREIGN_COLS, SYS_VIRTUAL exist.

trx_t::has_stats_table_lock(): Moved to dict0stats.cc.

Some error messages will now report table names in the internal
databasename/tablename format, instead of `databasename`.`tablename`.
2021-05-21 18:03:40 +03:00
Marko Mäkelä
5d495fc44b Merge 10.5 into 10.6 2021-05-19 09:15:54 +03:00
Marko Mäkelä
db8fb40824 Merge 10.4 into 10.5 2021-05-19 08:39:39 +03:00
Marko Mäkelä
c366845a0b MDEV-25691: Simplify handlerton::drop_database for InnoDB
The implementation of handlerton::drop_database in InnoDB is
unnecessarily complex. The minimal implementation should check
that no conflicting locks or references exist on the tables,
delete all table metadata in a single transaction, and finally
delete the tablespaces.

Note: DROP DATABASE will delete each individual table that the
SQL layer knows about, one table per transaction.
The handlerton::drop_database is basically a final cleanup step
for removing any garbage that could have been left behind
in InnoDB due to some bug, or not having atomic DDL in the past.

hash_node_t: Remove. Use the proper data type name in pointers.

dict_drop_index_tree(): Do not take the table as a parameter.
Instead, return the tablespace ID if the tablespace should be dropped
(we are dropping a clustered index tree).

fil_delete_tablespace(), fil_system_t::detach(): Return a single
detached file handle. Multi-file tablespaces cannot be deleted
via this interface.

ha_innobase::delete_table(): Remove a work-around for non-atomic DDL
and do not try to drop tables with similar-looking name.

innodb_drop_database(): Complete rewrite.

innobase_drop_database(), dict_get_first_table_name_in_db(),
row_drop_database_for_mysql(), drop_all_foreign_keys_in_db(): Remove.

row_purge_remove_clust_if_poss_low(), row_undo_ins_remove_clust_rec():
If the tablespace is to be deleted, try to evict the table definition
from the cache. Failing that, set dict_table_t::space to nullptr.

lock_release_on_rollback(): On the rollback of CREATE TABLE, release all
locks that the transaction had on the table, to avoid heap-use-after-free.
2021-05-18 12:53:40 +03:00
Marko Mäkelä
08b6fd9395 MDEV-25710: Dead code os_file_opendir() in the server
The functions fil_file_readdir_next_file(), os_file_opendir(),
os_file_closedir() became dead code in the server in MariaDB 10.4.0
with commit 09af00cbde (the removal of
the crash recovery logic for the TRUNCATE TABLE implementation that
was replaced in MDEV-13564).

os_file_opendir(), os_file_closedir(): Define as macros.
2021-05-18 12:13:18 +03:00
Marko Mäkelä
86dc7b4d4c MDEV-24626 Remove synchronous write of page0 file during file creation
During data file creation, InnoDB holds dict_sys mutex, tries to
write page 0 of the file and flushes the file. This not only causing
unnecessary contention but also a deviation from the write-ahead
logging protocol.

The clean sequence of operations is that we first start a dictionary
transaction and write SYS_TABLES and SYS_INDEXES records that identify
the tablespace. Then, we durably write a FILE_CREATE record to the
write-ahead log and create the file.

Recovery should not unnecessarily insist that the first page of each
data file that is referred to by the redo log is valid. It must be
enough that page 0 of the tablespace can be initialized based on the
redo log contents.

We introduce a new data structure deferred_spaces that keeps track
of corrupted-looking files during recovery. The data structure holds
the last LSN of a FILE_ record referring to the data file, the
tablespace identifier, and the last known file name.

There are two scenarios can happen during recovery:
i) Sufficient memory: InnoDB can reconstruct the
tablespace after parsing all redo log records.

ii) Insufficient memory(multiple apply phase): InnoDB should
store the deferred tablespace redo logs even though
tablespace is not present. InnoDB should start constructing
the tablespace when it first encounters deferred tablespace
id.

Mariabackup copies the zero filled ibd file in backup_fix_ddl() as
the extension of .new file. Mariabackup test case does page flushing
when it deals with DDL operation during backup operation.

fil_ibd_create(): Remove the write of page0 and flushing of file

fil_ibd_load(): Return FIL_LOAD_DEFER if the tablespace has
zero filled page0

Datafile: Clean up the error handling, and do not report errors
if we are in the middle of recovery. The caller will check
Datafile::m_defer.

fil_node_t::deferred: Indicates whether the tablespace loading was
deferred during recovery

FIL_LOAD_DEFER: Returned by fil_ibd_load() to indicate that tablespace
file was cannot be loaded.

recv_sys_t::recover_deferred(): Invoke deferred_spaces.create() to
initialize fil_space_t based on buffered metadata and records to
initialize page 0. Ignore the flags in fil_name_t, because they are
intentionally invalid.

fil_name_process(): Update deferred_spaces.

recv_sys_t::parse(): Store the redo log if the tablespace id
is present in deferred spaces

recv_sys_t::recover_low(): Should recover the first page of
the tablespace even though the tablespace instance is not
present

recv_sys_t::apply(): Initialize the deferred tablespace
before applying the deferred tablespace records

recv_validate_tablespace(): Skip the validation for deferred_spaces.

recv_rename_files(): Moved and revised from recv_sys_t::apply().
For deferred-recovery tablespaces, do not attempt to rename the
file if a deferred-recovery tablespace is associated with the name.

recv_recovery_from_checkpoint_start(): Invoke recv_rename_files()
and initialize all deferred tablespaces before applying redo log.

fil_node_t::read_page0(): Skip page0 validation if the tablespace
is deferred

buf_page_create_deferred(): A variant of buf_page_create() when
the fil_space_t is not available yet

This is joint work with Thirunarayanan Balathandayuthapani,
who implemented an initial prototype.
2021-05-17 18:12:33 +03:00
Marko Mäkelä
1c5ae99194 MDEV-25666: After backup, "Could not find a valid tablespace file"
Ever since MDEV-18518 made DDL operations mostly crash-safe inside InnoDB,
it became obvious that Mariabackup might not be entirely safe with regard to
concurrent DDL operations.

check_if_skip_table(): Do not skip files whose name starts with #sql.
We cannot know whether a DDL operation is in progress and the table
might in fact be needed later.
2021-05-14 08:36:14 +03:00
Marko Mäkelä
a0b133f431 MDEV-25312 fixup: Invoke fil_space_t::rename() correctly 2021-05-14 08:24:43 +03:00
Marko Mäkelä
916b237b3f Merge 10.5 into 10.6 2021-05-07 15:00:27 +03:00
Vladislav Vaintroub
a5b3982585 MDEV-25613 assertion (file_system.n_open > 0) failed
Remove operations on fil_system.n_open from mariabackup, as they are not
protected by the mutex, and serve no higher purpose anyway.
2021-05-07 08:13:28 +02:00
Marko Mäkelä
d2e2d32933 Merge 10.5 into 10.6 2021-04-14 12:32:27 +03:00
Marko Mäkelä
6c3e860cbf Merge 10.4 into 10.5 2021-04-14 11:35:39 +03:00
Marko Mäkelä
5008171b05 Merge 10.3 into 10.4 2021-04-14 10:33:59 +03:00
Marko Mäkelä
6e6318b29b Merge 10.2 into 10.3 2021-04-13 10:26:01 +03:00
Julius Goryavsky
3eecb8db22 MDEV-25356: SST scripts should use the new mariabackup interface
SST scripts for Galera should use the new mariabackup interface
instead of the innobackupex interface, which is currently only
supported for compatibility reasons.

This commit converts the SST script for mariabackup to use the
new interface. It does not need separate tests, as any problems
will be seen as failures when running multiple tests for the
mariabackup-based SST.
2021-04-11 17:07:36 +02:00
Julius Goryavsky
bf1e09e0c4 Removed extra spaces in generated command lines (minor "cosmetic"
change after MDEV-24197)
2021-04-11 02:27:03 +02:00
Julius Goryavsky
8ff0ac45dc MDEV-25328: --innodb command line option causes mariabackup to fail
This patch fixes an issue with launching mariabackup during SST
(when used with Galera), when during bootstrap mariabackup receives
the "--innodb" option, which is incorrectly interpreted as shortcut
for "--innodb-force-recovery". This patch does not require separate
test for mtr, as the problem is visible in general testing on
buildbot.
2021-04-11 02:26:52 +02:00
Marko Mäkelä
450c017c2d Merge 10.2 into 10.3 2021-04-09 14:32:06 +03:00
Marko Mäkelä
cf552f5886 MDEV-25312 Replace fil_space_t::name with fil_space_t::name()
A consistency check for fil_space_t::name is causing recovery failures
in MDEV-25180 (Atomic ALTER TABLE). So, we'd better remove that field
altogether.

fil_space_t::name was more or less a copy of dict_table_t::name
(except for some special cases), and it was not being used for
anything useful.

There used to be a name_hash, but it had been removed already in
commit a75dbfd718 (MDEV-12266).

We will also remove os_normalize_path(), OS_PATH_SEPARATOR,
OS_PATH_SEPATOR_ALT. On Microsoft Windows, we will treat \ and /
roughly in the same way. The intention is that for per-table
tablespaces, the filenames will always follow the pattern
prefix/databasename/tablename.ibd. (Any \ in the prefix must not
be converted.)

ut_basename_noext(): Remove (unused function).

read_link_file(): Replaces RemoteDatafile::read_link_file().
We will ensure that the last two path component separators are
forward slashes (converting up to 2 trailing backslashes on
Microsoft Windows), so that everywhere else we can
assume that data file names end in "/databasename/tablename.ibd".

Note: On Microsoft Windows, path names that start with \\?\ must
not contain / as path component separators. Previously, such paths
did work in the DATA DIRECTORY argument of InnoDB tables.

Reviewed by: Vladislav Vaintroub
2021-04-07 18:01:13 +03:00
Julius Goryavsky
fb9d151934 MDEV-25321: mariabackup failed if password is passed via environment variable
The mariabackup interface currently supports passing a password
through an explicit command line variable, but does not support
passing a password through the MYSQL_PWD environment variable.
At the same time, the Galera SST script for mariabackup uses
the environment variable to pass the password, which leads
(in some cases) to an unsuccessful launch of mariabackup and
to the inability to start the cluster. This patch fixes this
issue. It does not need a separate test, as the problem is
visible in general testing on buildbot.
2021-04-01 21:47:30 +02:00
Srinidhi Kaushik
5bc5ecce08 MDEV-24197: Add "innodb_force_recovery" for "mariabackup --prepare"
During the prepare phase of restoring backups, "mariabackup" does
not seem to allow (or recognize) the option "innodb_force_recovery"
for the embedded InnoDB server instance that it starts.

If page corruption observed during page recovery, the prepare step
fails. While this is indeed the correct behavior ideally, allowing
this option to be set in case of emergencies might be useful when
the current backup is the only copy available. Some error messages
during "--prepare" suggest to set "innodb_force_recovery" to 1:

  [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.

For backwards compatibility, "mariabackup --innobackupex --apply-log"
should also have this option.

Signed-off-by: Srinidhi Kaushik <shrinidhi.kaushik@gmail.com>
2021-04-01 13:34:40 +03:00
Vladislav Vaintroub
08cb5d8483 MDEV-25221 Do not remove source file, if copy_file() fails in mariabackup --move-back
Remove an incompletely copied destination file.
2021-03-31 14:23:56 +02:00
Eugene Kosov
1274bb7729 MDEV-25223 change fil_system_t::space_list and fil_system_t::named_spaces from UT_LIST to ilist
Mostly a refactoring. Also, debug functions were added for ease of life while debugging
2021-03-24 15:15:18 +03:00
Marko Mäkelä
1055cf967f Merge 10.5 into 10.6 2021-03-20 18:40:07 +02:00
David Carlier
1bacab8ab9 mariabackup little FreeBSD update support.
In this platform, it s better not to rely on optional proc filesystem presence.
 Using native API to retrieve binary absolute path instead.
2021-03-20 15:29:56 +11:00
Eugene Kosov
dbe941e06f cleanup: os_thread_create -> std::thread 2021-03-19 11:44:28 +03:00
Eugene Kosov
62e4aaa240 cleanup: os_thread_sleep() -> std::this_thread::sleep_for()
std version has an advantage of a more convenient units implementation from
std::chrono. Now it's no need to multipy/divide to bring anything to
micro seconds.
2021-03-19 11:44:03 +03:00
Marko Mäkelä
783625d78f MDEV-24883 add io_uring support for tpool
liburing is a new optional dependency (WITH_URING=auto|yes|no)
that replaces libaio when it is available.

aio_uring: class which wraps io_uring stuff

aio_uring::bind()/unbind(): optional optimization

aio_uring::submit_io(): mutex prevents data race. liburing calls are
thread-unsafe. But if you look into it's implementation you'll see
atomic operations. They're used for synchronization between kernel and
user-space only. That's why our own synchronization is still needed.

For systemd, we add LimitMEMLOCK=524288 (ulimit -l 524288)
because the io_uring_setup system call that is invoked
by io_uring_queue_init() requests locked memory. The value
was found empirically; with 262144, we would occasionally
fail to enable io_uring when using the maximum values of
innodb_read_io_threads=64 and innodb_write_io_threads=64.

aio_uring::thread_routine(): Tolerate -EINTR return from
io_uring_wait_cqe(), because it may occur on shutdown
on Ubuntu 20.10 (Groovy Gorilla).

This was mostly implemented by Eugene Kosov. Systemd integration
and improved startup/shutdown error handling by Marko Mäkelä.
2021-03-15 11:30:17 +02:00
Marko Mäkelä
a43ff483fa Merge 10.5 into 10.6 2021-03-11 20:20:07 +02:00
Marko Mäkelä
a4b7232b2c Merge 10.4 into 10.5 2021-03-11 20:09:34 +02:00
Marko Mäkelä
7a4fbb55b0 MDEV-25105 Remove innodb_checksum_algorithm values none,innodb,...
Historically, InnoDB supported a buggy page checksum algorithm that did not
compute a checksum over the full page. Later, well before MySQL 4.1
introduced .ibd files and the innodb_file_per_table option, the algorithm
was corrected and the first 4 bytes of each page were redefined to be
a checksum.

The original checksum was so slow that an option to disable page checksum
was introduced for benchmarketing purposes.

The Intel Nehalem microarchitecture introduced the SSE4.2 instruction set
extension, which includes instructions for faster computation of CRC-32C.
In MySQL 5.6 (and MariaDB 10.0), innodb_checksum_algorithm=crc32 was
implemented to make of that. As that option was changed to be the default
in MySQL 5.7, a bug was found on big-endian platforms and some work-around
code was added to weaken that checksum further. MariaDB disables that
work-around by default since MDEV-17958.

Later, SIMD-accelerated CRC-32C has been implemented in MariaDB for POWER
and ARM and also for IA-32/AMD64, making use of carry-less multiplication
where available.

Long story short, innodb_checksum_algorithm=crc32 is faster and more secure
than the pre-MySQL 5.6 checksum, called innodb_checksum_algorithm=innodb.
It should have removed any need to use innodb_checksum_algorithm=none.

The setting innodb_checksum_algorithm=crc32 is the default in
MySQL 5.7 and MariaDB Server 10.2, 10.3, 10.4. In MariaDB 10.5,
MDEV-19534 made innodb_checksum_algorithm=full_crc32 the default.
It is even faster and more secure.

The default settings in MariaDB do allow old data files to be read,
no matter if a worse checksum algorithm had been used.
(Unfortunately, before innodb_checksum_algorithm=full_crc32,
the data files did not identify which checksum algorithm is being used.)

The non-default settings innodb_checksum_algorithm=strict_crc32 or
innodb_checksum_algorithm=strict_full_crc32 would only allow CRC-32C
checksums. The incompatibility with old data files is why they are
not the default.

The newest server not to support innodb_checksum_algorithm=crc32
were MySQL 5.5 and MariaDB 5.5. Both have reached their end of life.
A valid reason for using innodb_checksum_algorithm=innodb could have
been the ability to downgrade. If it is really needed, data files
can be converted with an older version of the innochecksum utility.

Because there is no good reason to allow data files to be written
with insecure checksums, we will reject those option values:

    innodb_checksum_algorithm=none
    innodb_checksum_algorithm=innodb
    innodb_checksum_algorithm=strict_none
    innodb_checksum_algorithm=strict_innodb

Furthermore, the following innochecksum options will be removed,
because only strict crc32 will be supported:

    innochecksum --strict-check=crc32
    innochecksum -C crc32
    innochecksum --write=crc32
    innochecksum -w crc32

If a user wishes to convert a data file to use a different checksum
(so that it might be used with the no-longer-supported
MySQL 5.5 or MariaDB 5.5, which do not support IMPORT TABLESPACE
nor system tablespace format changes that were made in MariaDB 10.3),
then the innochecksum tool from MariaDB 10.2, 10.3, 10.4, 10.5 or
MySQL 5.7 can be used.

Reviewed by: Thirunarayanan Balathandayuthapani
2021-03-11 12:46:18 +02:00
David CARLIER
1dff411e84 arguments overflow fix proposal. the list is assumed to be implictly null terminated at usage time. 2021-03-09 15:51:38 +02:00
David CARLIER
e3a597378e mariabackup utility, binary path implementation for Mac.
implements in a native way get_exepath which gives reliably the full
path.
2021-03-09 15:51:38 +02:00
Marko Mäkelä
d346763479 Merge 10.5 into 10.6 2021-03-08 10:51:31 +02:00
Marko Mäkelä
a5d3c1c819 Merge 10.4 into 10.5 2021-03-08 10:16:20 +02:00
Marko Mäkelä
a26e7a3726 Merge 10.3 into 10.4 2021-03-08 09:39:54 +02:00
Marko Mäkelä
bcd160753c Merge 10.2 into 10.3 2021-03-05 10:06:42 +02:00
Vladislav Vaintroub
545cba13eb MDEV-22929 fixup. Print "completed OK!" if page corruption and --log-innodb-page-corruption
Since we do not stop at corrupted page error, there is no reason to log
a backup error.
2021-03-05 09:04:30 +01:00
Marko Mäkelä
420f8e24ab MDEV-24854: Change innodb_flush_method=O_DIRECT by default
We have innodb_use_native_aio=ON by default since the introduction of
that parameter in commit 2f9fb41b05
(MySQL 5.5 and MariaDB 5.5).

However, to really benefit from the setting, the files should be
opened in O_DIRECT mode, to bypass the file system cache.
In this way, the reads and writes can be submitted with DMA, using
the InnoDB buffer pool directly, and no processor cycles need to be
used for copying data. The use of O_DIRECT benefits not only the
current libaio implementation, but also liburing.

os_file_set_nocache(): Test innodb_flush_method in the function,
not in the callers.
2021-02-20 11:58:58 +02:00
Marko Mäkelä
b19ec8848c Merge 10.5 into 10.6 2021-02-11 09:26:53 +02:00
Monty
5d6ad2ad66 Added 'const' to arguments in get_one_option and find_typeset()
One should not change the program arguments!
This change also reduces warnings from the icc compiler.

Almost all changes are just syntax changes (adding const to
'get_one_option function' declarations).

Other changes:
- Added a few cast of 'argument' from 'const char*' to 'char *'. This
  was mainly in calls to 'external' functions we don't have control of.
- Ensure that all reset of 'password command line argument' are similar.
  (In almost all cases it was just adding a comment and a cast)
- In mysqlbinlog.cc and mysqld.cc there was a few cases that changed
  the command line argument. These places where changed to instead allocate
  the option in a MEM_ROOT to avoid changing the argument. Some of this
  code was changed to ensure that different programs did parsing the
  same way. Added a test case for the changes in mysqlbinlog.cc
- Changed a few variables that took their value from command line options
  from 'char *' to 'const char *'.
2021-02-08 12:16:29 +02:00
Marko Mäkelä
92abdcca5a Merge 10.5 into 10.6 2021-01-07 09:08:09 +02:00
Marko Mäkelä
a993310593 MDEV-24537 innodb_max_dirty_pages_pct_lwm=0 lost its special meaning
In commit 3a9a3be1c6 (MDEV-23855)
some previous logic was replaced with the condition
dirty_pct < srv_max_dirty_pages_pct_lwm, which caused
the default value of the parameter innodb_max_dirty_pages_pct_lwm=0
to lose its special meaning: 'refer to innodb_max_dirty_pages_pct instead'.

This implicit special meaning was visible in the function
af_get_pct_for_dirty(), which was removed in
commit f0c295e2de (MDEV-24369).

page_cleaner_flush_pages_recommendation(): Restore the special
meaning that was removed in MDEV-24369.

buf_flush_page_cleaner(): If srv_max_dirty_pages_pct_lwm==0.0,
refer to srv_max_buf_pool_modified_pct. This fixes the observed
performance regression due to excessive page flushing.

buf_pool_t::page_cleaner_wakeup(): Revise the wakeup condition.

innodb_init(): Do initialize srv_max_io_capacity in Mariabackup.
It was previously constantly 0, which caused mariadb-backup --prepare
to hang in buf_flush_sync(), making no progress.
2021-01-06 13:53:14 +02:00
Oleksandr Byelkin
02e7bff882 Merge commit '10.4' into 10.5 2021-01-06 10:53:00 +01:00
Oleksandr Byelkin
478b83032b Merge branch '10.3' into 10.4 2020-12-25 09:13:28 +01:00
Oleksandr Byelkin
25561435e0 Merge branch '10.2' into 10.3 2020-12-23 19:28:02 +01:00
Marko Mäkelä
0aa02567dd Merge 10.3 into 10.4 2020-12-23 14:52:59 +02:00
Marko Mäkelä
fa1aef39eb Merge 10.2 into 10.3 2020-12-23 14:25:45 +02:00
Vlad Lesin
719da2c4cc MDEV-22810 mariabackup does not honor open_files_limit from option during backup prepare
open_files_limit option was processed only for --backup, but not for
--prepare.
2020-12-16 10:23:41 +03:00
Marko Mäkelä
ff5d306e29 MDEV-21452: Replace ib_mutex_t with mysql_mutex_t
SHOW ENGINE INNODB MUTEX functionality is completely removed,
as are the InnoDB latching order checks.

We will enforce innodb_fatal_semaphore_wait_threshold
only for dict_sys.mutex and lock_sys.mutex.

dict_sys_t::mutex_lock(): A single entry point for dict_sys.mutex.

lock_sys_t::mutex_lock(): A single entry point for lock_sys.mutex.

FIXME: srv_sys should be removed altogether; it is duplicating tpool
functionality.

fil_crypt_threads_init(): To prevent SAFE_MUTEX warnings, we must
not hold fil_system.mutex.

fil_close_all_files(): To prevent SAFE_MUTEX warnings for
fil_space_destroy_crypt_data(), we must not hold fil_system.mutex
while invoking fil_space_free_low() on a detached tablespace.
2020-12-15 17:56:18 +02:00
Marko Mäkelä
38fd7b7d91 MDEV-21452: Replace all direct use of os_event_t
Let us replace os_event_t with mysql_cond_t, and replace the
necessary ib_mutex_t with mysql_mutex_t so that they can be
used with condition variables.

Also, let us replace polling (os_thread_sleep() or timed waits)
with plain mysql_cond_wait() wherever possible.

Furthermore, we will use the lightweight srw_mutex for trx_t::mutex,
to hopefully reduce contention on lock_sys.mutex.

FIXME: Add test coverage of
mariabackup --backup --kill-long-queries-timeout
2020-12-15 17:56:17 +02:00
Marko Mäkelä
9ecd766526 Merge 10.5 into 10.6 2020-12-14 18:02:40 +02:00
Marko Mäkelä
f24b738318 MDEV-24313 (2 of 2): Silently ignored innodb_use_native_aio=1
In commit 5e62b6a5e0 (MDEV-16264)
the logic of os_aio_init() was changed so that it will never fail,
but instead automatically disable innodb_use_native_aio (which is
enabled by default) if the io_setup() system call would fail due
to resource limits being exceeded. This is questionable, especially
because falling back to simulated AIO may lead to significantly
reduced performance.

srv_n_file_io_threads, srv_n_read_io_threads, srv_n_write_io_threads:
Change the data type from ulong to uint.

os_aio_init(): Remove the parameters, and actually return an error code.

thread_pool::configure_aio(): Do not silently fall back to simulated AIO.

Reviewed by: Vladislav Vaintroub
2020-12-14 15:27:03 +02:00
Marko Mäkelä
8677c14e65 MDEV-24391 heap-use-after-free in fil_space_t::flush_low()
We observed a race condition that involved two threads
executing fil_flush_file_spaces() and one thread
executing fil_delete_tablespace(). After one of the
fil_flush_file_spaces() observed that
space.needs_flush_not_stopping() is set and was
releasing the fil_system.mutex, the other fil_flush_file_spaces()
would complete the execution of fil_space_t::flush_low() on
the same tablespace. Then, fil_delete_tablespace() would
destroy the object, because the value of fil_space_t::n_pending
did not prevent that. Finally, the fil_flush_file_spaces() would
resume execution and invoke fil_space_t::flush_low() on the freed
object.

This race condition was introduced in
commit 118e258aaa of MDEV-23855.

fil_space_t::flush(): Add a template parameter that indicates
whether the caller is holding a reference to prevent the
tablespace from being freed.

buf_dblwr_t::flush_buffered_writes_completed(),
row_quiesce_table_start(): Acquire a reference for the duration
of the fil_space_t::flush_low() operation. It should be impossible
for the object to be freed in these code paths, but we want to
satisfy the debug assertions.

fil_space_t::flush_low(): Do not increment or decrement the
reference count, but instead assert that the caller is holding
a reference.

fil_space_extend_must_retry(), fil_flush_file_spaces():
Acquire a reference before releasing fil_system.mutex.
This is what will fix the race condition.
2020-12-11 09:05:26 +02:00
Marko Mäkelä
1eb59c307d MDEV-24340 Unique final message of InnoDB during shutdown
innobase_space_shutdown(): Remove. We want this step to be executed
before the message "InnoDB: Shutdown completed; log sequence number "
is output by innodb_shutdown(). It used to be executed after that step.

innodb_shutdown(): Duplicate the code that used to live in
innobase_space_shutdown().

innobase_init_abort(): Merge with innobase_space_shutdown().
2020-12-04 11:46:47 +02:00
Marko Mäkelä
a13fac9eee Merge 10.5 into 10.6 2020-12-03 08:12:47 +02:00
Marko Mäkelä
6a1e655cb0 Merge 10.4 into 10.5 2020-12-02 18:29:49 +02:00
Marko Mäkelä
589cf8dbf3 Merge 10.3 into 10.4 2020-12-01 19:51:14 +02:00
Vlad Lesin
e30a05f454 MDEV-22929 MariaBackup option to report and/or continue when corruption is encountered
Post-push Windows compilation errors fix.
2020-12-01 18:33:10 +03:00
Marko Mäkelä
81ab9ea63f Merge 10.2 into 10.3 2020-12-01 14:55:46 +02:00
Vlad Lesin
e6b3e38d62 MDEV-22929 MariaBackup option to report and/or continue when corruption is encountered
The new option --log-innodb-page-corruption is introduced.

When this option is set, backup is not interrupted if innodb corrupted
page is detected. Instead it logs all found corrupted pages in
innodb_corrupted_pages file in backup directory and finishes with error.

For incremental backup corrupted pages are also copied to .delta file,
because we can't do LSN check for such pages during backup,
innodb_corrupted_pages will also be created in incremental backup
directory.

During --prepare, corrupted pages list is read from the file just after
redo log is applied, and each page from the list is checked if it is allocated
in it's tablespace or not. If it is not allocated, then it is zeroed out,
flushed to the tablespace and removed from the list. If all pages are removed
from the list, then --prepare is finished successfully and
innodb_corrupted_pages file is removed from backup directory. Otherwise
--prepare is finished with error message and innodb_corrupted_pages contains
the list of the pages, which are detected as corrupted during backup, and are
allocated in their tablespaces, what means backup directory contains corrupted
innodb pages, and backup can not be considered as consistent.

For incremental --prepare corrupted pages from .delta files are applied
to the base backup, innodb_corrupted_pages is read from both base in
incremental directories, and the same action is proceded for corrupted
pages list as for full --prepare. innodb_corrupted_pages file is
modified or removed only in base directory.

If DDL happens during backup, it is also processed at the end of backup
to have correct tablespace names in innodb_corrupted_pages.
2020-12-01 08:08:57 +03:00
Marko Mäkelä
533a13af06 Merge 10.3 into 10.4 2020-11-03 14:49:17 +02:00
Marko Mäkelä
09a1f0075a Merge 10.5 into 10.6 2020-11-02 12:49:19 +02:00
Oleksandr Byelkin
8e1e2856f2 Merge branch '10.4' into 10.5 2020-11-01 14:26:15 +01:00
Oleksandr Byelkin
80c951ce28 Merge branch '10.3' into 10.4 2020-10-31 21:06:49 +01:00
Oleksandr Byelkin
794f665139 Merge branch '10.2' into 10.3 2020-10-30 17:23:53 +01:00
Marko Mäkelä
898521e2dd Merge 10.4 into 10.5 2020-10-30 11:15:30 +02:00
Marko Mäkelä
7b2bb67113 Merge 10.3 into 10.4 2020-10-29 13:38:38 +02:00
Vlad Lesin
6cb88685c4 MDEV-24026: InnoDB: Failing assertion: os_total_large_mem_allocated >= size upon incremental backup
mariabackup deallocated uninitialized
write_filt_ctxt.u.wf_incremental_ctxt in xtrabackup_copy_datafile() when
some table should be skipped due to parsed DDL redo log record.
2020-10-29 07:39:43 +01:00
Marko Mäkelä
a8de8f261d Merge 10.2 into 10.3 2020-10-28 10:01:50 +02:00
Marko Mäkelä
c27e53f459 MDEV-23855: Use normal mutex for log_sys.mutex, log_sys.flush_order_mutex
With an unreasonably small innodb_log_file_size, the page cleaner
thread would frequently acquire log_sys.flush_order_mutex and spend
a significant portion of CPU time spinning on that mutex when
determining the checkpoint LSN.
2020-10-26 17:53:55 +02:00
Marko Mäkelä
118e258aaa MDEV-23855: Shrink fil_space_t
Merge n_pending_ios, n_pending_ops to std::atomic<uint32_t> n_pending.
Change some more fil_space_t members to uint32_t to reduce
the memory footprint.

fil_space_t::add(), fil_ibd_create(): Attach the already opened
handle to the tablespace, and enforce the fil_system.n_open limit.

dict_boot(): Initialize fil_system.max_assigned_id.

srv_boot(): Call srv_thread_pool_init() before anything else,
so that files should be opened in the correct mode on Windows.

fil_ibd_create(): Create the file in OS_FILE_AIO mode, just like
fil_node_open_file_low() does it.

dict_table_t::is_accessible(): Replaces fil_table_accessible().

Reviewed by: Vladislav Vaintroub
2020-10-26 17:53:54 +02:00
Marko Mäkelä
45ed9dd957 MDEV-23855: Remove fil_system.LRU and reduce fil_system.mutex contention
Also fixes MDEV-23929: innodb_flush_neighbors is not being ignored
for system tablespace on SSD

When the maximum configured number of file is exceeded, InnoDB will
close data files. We used to maintain a fil_system.LRU list and
a counter fil_node_t::n_pending to achieve this, at the huge cost
of multiple fil_system.mutex operations per I/O operation.

fil_node_open_file_low(): Implement a FIFO replacement policy:
The last opened file will be moved to the end of fil_system.space_list,
and files will be closed from the start of the list. However, we will
not move tablespaces in fil_system.space_list while
i_s_tablespaces_encryption_fill_table() is executing
(producing output for INFORMATION_SCHEMA.INNODB_TABLESPACES_ENCRYPTION)
because it may cause information of some tablespaces to go missing.
We also avoid this in mariabackup --backup because datafiles_iter_next()
assumes that the ordering is not changed.

IORequest: Fold more parameters to IORequest::type.

fil_space_t::io(): Replaces fil_io().

fil_space_t::flush(): Replaces fil_flush().

OS_AIO_IBUF: Remove. We will always issue synchronous reads of the
change buffer pages in buf_read_page_low().

We will always ignore some errors for background reads.

This should reduce fil_system.mutex contention a little.

fil_node_t::complete_write(): Replaces fil_node_t::complete_io().
On both read and write completion, fil_space_t::release_for_io()
will have to be called.

fil_space_t::io(): Do not acquire fil_system.mutex in the normal
code path.

xb_delta_open_matching_space(): Do not try to open the system tablespace
which was already opened. This fixes a file sharing violation in
mariabackup --prepare --incremental.

Reviewed by: Vladislav Vaintroub
2020-10-26 17:09:01 +02:00
Marko Mäkelä
3a9a3be1c6 MDEV-23855: Improve InnoDB log checkpoint performance
After MDEV-15053, MDEV-22871, MDEV-23399 shifted the scalability
bottleneck, log checkpoints became a new bottleneck.

If innodb_io_capacity is set low or innodb_max_dirty_pct_lwm is
set high and the workload fits in the buffer pool, the page cleaner
thread will perform very little flushing. When we reach the capacity
of the circular redo log file ib_logfile0 and must initiate a checkpoint,
some 'furious flushing' will be necessary. (If innodb_flush_sync=OFF,
then flushing would continue at the innodb_io_capacity rate, and
writers would be throttled.)

We have the best chance of advancing the checkpoint LSN immediately
after a page flush batch has been completed. Hence, it is best to
perform checkpoints after every batch in the page cleaner thread,
attempting to run once per second.

By initiating high-priority flushing in the page cleaner as early
as possible, we aim to make the throughput more stable.

The function buf_flush_wait_flushed() used to sleep for 10ms, hoping
that the page cleaner thread would do something during that time.
The observed end result was that a large number of threads that call
log_free_check() would end up sleeping while nothing useful is happening.

We will revise the design so that in the default innodb_flush_sync=ON
mode, buf_flush_wait_flushed() will wake up the page cleaner thread
to perform the necessary flushing, and it will wait for a signal from
the page cleaner thread.

If innodb_io_capacity is set to a low value (causing the page cleaner to
throttle its work), a write workload would initially perform well, until
the capacity of the circular ib_logfile0 is reached and log_free_check()
will trigger checkpoints. At that point, the extra waiting in
buf_flush_wait_flushed() will start reducing throughput.

The page cleaner thread will also initiate log checkpoints after each
buf_flush_lists() call, because that is the best point of time for
the checkpoint LSN to advance by the maximum amount.

Even in 'furious flushing' mode we invoke buf_flush_lists() with
innodb_io_capacity_max pages at a time, and at the start of each
batch (in the log_flush() callback function that runs in a separate
task) we will invoke os_aio_wait_until_no_pending_writes(). This
tweak allows the checkpoint to advance in smaller steps and
significantly reduces the maximum latency. On an Intel Optane 960
NVMe SSD on Linux, it reduced from 4.6 seconds to 74 milliseconds.
On Microsoft Windows with a slower SSD, it reduced from more than
180 seconds to 0.6 seconds.

We will make innodb_adaptive_flushing=OFF simply flush innodb_io_capacity
per second whenever the dirty proportion of buffer pool pages exceeds
innodb_max_dirty_pages_pct_lwm. For innodb_adaptive_flushing=ON we try
to make page_cleaner_flush_pages_recommendation() more consistent and
predictable: if we are below innodb_adaptive_flushing_lwm, let us flush
pages according to the return value of af_get_pct_for_dirty().

innodb_max_dirty_pages_pct_lwm: Revert the change of the default value
that was made in MDEV-23399. The value innodb_max_dirty_pages_pct_lwm=0
guarantees that a shutdown of an idle server will be fast. Users might
be surprised if normal shutdown suddenly became slower when upgrading
within a GA release series.

innodb_checkpoint_usec: Remove. The master task will no longer perform
periodic log checkpoints. It is the duty of the page cleaner thread.

log_sys.max_modified_age: Remove. The current span of the
buf_pool.flush_list expressed in LSN only matters for adaptive
flushing (outside the 'furious flushing' condition).
For the correctness of checkpoints, the only thing that matters is
the checkpoint age (log_sys.lsn - log_sys.last_checkpoint_lsn).
This run-time constant was also reported as log_max_modified_age_sync.

log_sys.max_checkpoint_age_async: Remove. This does not serve any
purpose, because the checkpoints will now be triggered by the page
cleaner thread. We will retain the log_sys.max_checkpoint_age limit
for engaging 'furious flushing'.

page_cleaner.slot: Remove. It turns out that
page_cleaner_slot.flush_list_time was duplicating
page_cleaner.slot.flush_time and page_cleaner.slot.flush_list_pass
was duplicating page_cleaner.flush_pass.
Likewise, there were some redundant monitor counters, because the
page cleaner thread no longer performs any buf_pool.LRU flushing, and
because there only is one buf_flush_page_cleaner thread.

buf_flush_sync_lsn: Protect writes by buf_pool.flush_list_mutex.

buf_pool_t::get_oldest_modification(): Add a parameter to specify the
return value when no persistent data pages are dirty. Require the
caller to hold buf_pool.flush_list_mutex.

log_buf_pool_get_oldest_modification(): Take the fall-back LSN
as a parameter. All callers will also invoke log_sys.get_lsn().

log_preflush_pool_modified_pages(): Replaced with buf_flush_wait_flushed().

buf_flush_wait_flushed(): Implement two limits. If not enough buffer pool
has been flushed, signal the page cleaner (unless innodb_flush_sync=OFF)
and wait for the page cleaner to complete. If the page cleaner
thread is not running (which can be the case durign shutdown),
initiate the flush and wait for it directly.

buf_flush_ahead(): If innodb_flush_sync=ON (the default),
submit a new buf_flush_sync_lsn target for the page cleaner
but do not wait for the flushing to finish.

log_get_capacity(), log_get_max_modified_age_async(): Remove, to make
it easier to see that af_get_pct_for_lsn() is not acquiring any mutexes.

page_cleaner_flush_pages_recommendation(): Protect all access to
buf_pool.flush_list with buf_pool.flush_list_mutex. Previously there
were some race conditions in the calculation.

buf_flush_sync_for_checkpoint(): New function to process
buf_flush_sync_lsn in the page cleaner thread. At the end of
each batch, we try to wake up any blocked buf_flush_wait_flushed().
If everything up to buf_flush_sync_lsn has been flushed, we will
reset buf_flush_sync_lsn=0. The page cleaner thread will keep
'furious flushing' until the limit is reached. Any threads that
are waiting in buf_flush_wait_flushed() will be able to resume
as soon as their own limit has been satisfied.

buf_flush_page_cleaner: Prioritize buf_flush_sync_lsn and do not
sleep as long as it is set. Do not update any page_cleaner statistics
for this special mode of operation. In the normal mode
(buf_flush_sync_lsn is not set for innodb_flush_sync=ON),
try to wake up once per second. No longer check whether
srv_inc_activity_count() has been called. After each batch,
try to perform a log checkpoint, because the best chances for
the checkpoint LSN to advance by the maximum amount are upon
completing a flushing batch.

log_t: Move buf_free, max_buf_free possibly to the same cache line
with log_sys.mutex.

log_margin_checkpoint_age(): Simplify the logic, and replace
a 0.1-second sleep with a call to buf_flush_wait_flushed() to
initiate flushing. Moved to the same compilation unit
with the only caller.

log_close(): Clean up the calculations. (Should be no functional
change.) Return whether flush-ahead is needed. Moved to the same
compilation unit with the only caller.

mtr_t::finish_write(): Return whether flush-ahead is needed.

mtr_t::commit(): Invoke buf_flush_ahead() when needed. Let us avoid
external calls in mtr_t::commit() and make the logic easier to follow
by having related code in a single compilation unit. Also, we will
invoke srv_stats.log_write_requests.inc() only once per
mini-transaction commit, while not holding mutexes.

log_checkpoint_margin(): Only care about log_sys.max_checkpoint_age.
Upon reaching log_sys.max_checkpoint_age where we must wait to prevent
the log from getting corrupted, let us wait for at most 1MiB of LSN
at a time, before rechecking the condition. This should allow writers
to proceed even if the redo log capacity has been reached and
'furious flushing' is in progress. We no longer care about
log_sys.max_modified_age_sync or log_sys.max_modified_age_async.
The log_sys.max_modified_age_sync could be a relic from the time when
there was a srv_master_thread that wrote dirty pages to data files.
Also, we no longer have any log_sys.max_checkpoint_age_async limit,
because log checkpoints will now be triggered by the page cleaner
thread upon completing buf_flush_lists().

log_set_capacity(): Simplify the calculations of the limit
(no functional change).

log_checkpoint_low(): Split from log_checkpoint(). Moved to the
same compilation unit with the caller.

log_make_checkpoint(): Only wait for everything to be flushed until
the current LSN.

create_log_file(): After checkpoint, invoke log_write_up_to()
to ensure that the FILE_CHECKPOINT record has been written.
This avoids ut_ad(!srv_log_file_created) in create_log_file_rename().

srv_start(): Do not call recv_recovery_from_checkpoint_start()
if the log has just been created. Set fil_system.space_id_reuse_warned
before dict_boot() has been executed, and clear it after recovery
has finished.

dict_boot(): Initialize fil_system.max_assigned_id.

srv_check_activity(): Remove. The activity count is counting transaction
commits and therefore mostly interesting for the purge of history.

BtrBulk::insert(): Do not explicitly wake up the page cleaner,
but do invoke srv_inc_activity_count(), because that counter is
still being used in buf_load_throttle_if_needed() for some
heuristics. (It might be cleaner to execute buf_load() in the
page cleaner thread!)

Reviewed by: Vladislav Vaintroub
2020-10-26 17:09:01 +02:00
Marko Mäkelä
5999d5120e MDEV-23399 fixup: Avoid crash on Mariabackup shutdown
innodb_preshutdown(): Terminate the encryption threads before
the page cleaner thread can be shut down.

innodb_shutdown(): Always wait for the encryption threads and
page cleaner to shut down.

srv_shutdown_all_bg_threads(): Wait for the encryption threads and
the page cleaner to shut down. (After an aborted startup,
innodb_shutdown() would not be called.)

row_get_background_drop_list_len_low(): Remove.

os_thread_count: Remove. Alternatively, at the end of
srv_shutdown_all_bg_threads() we could try to wait longer
for the count to reach 0. On some platforms, an assertion
os_thread_count==0 could fail even after a small delay,
even though in the core dump all threads would have exited.

srv_shutdown_threads(): Renamed from srv_shutdown_all_bg_threads().
Do not wait for the page cleaner to shut down, because the later
innodb_shutdown(), which may invoke
logs_empty_and_mark_files_at_shutdown(), assumes that it exists.
2020-10-26 15:05:41 +02:00
Vlad Lesin
985ede9203 MDEV-20755 InnoDB: Database page corruption on disk or a failed file read of tablespace upon prepare of mariabackup incremental backup
The problem:

When incremental backup is taken, delta files are created for innodb tables
which are marked as new tables during innodb ddl tracking. When such
tablespace is tried to be opened during prepare in
xb_delta_open_matching_space(), it is "created", i.e.
xb_space_create_file() is invoked, instead of opening, even if
a tablespace with the same name exists in the base backup directory.

xb_space_create_file() writes page 0 header the tablespace.
This header does not contain crypt data, as mariabackup does not have
any information about crypt data in delta file metadata for
tablespaces.

After delta file is applied, recovery process is started. As the
sequence of recovery for different pages is not defined, there can be
the situation when crypt data redo log event is executed after some
other page is read for recovery. When some page is read for recovery, it's
decrypted using crypt data stored in tablespace header in page 0, if
there is no crypt data, the page is not decryped and does not pass corruption
test.

This causes error for incremental backup --prepare for encrypted
tablespaces.

The error is not stable because crypt data redo log event updates crypt
data on page 0, and recovery for different pages can be executed in
undefined order.

The fix:

When delta file is created, the corresponding write filter copies only
the pages which LSN is greater then some incremental LSN. When new file
is created during incremental backup, the LSN of all it's pages must be
greater then incremental LSN, so there is no need to create delta for
such table, we can just copy it completely.

The fix is to copy the whole file which was tracked during incremental backup
with innodb ddl tracker, and copy it to base directory during --prepare
instead of delta applying.

There is also DBUG_EXECUTE_IF() in innodb code to avoid writing redo log
record for crypt data updating on page 0 to make the test case stable.

Note:

The issue is not reproducible in 10.5 as optimized DDL's are deprecated
in 10.5. But the fix is still useful because it allows to decrease
data copy size during backup, as delta file contains some extra info.
The test case should be removed for 10.5 as it will always pass.
2020-10-23 11:02:25 +03:00
Marko Mäkelä
1657b7a583 Merge 10.4 to 10.5 2020-10-22 17:08:49 +03:00
Marko Mäkelä
46957a6a77 Merge 10.3 into 10.4 2020-10-22 13:27:18 +03:00
Marko Mäkelä
e3d692aa09 Merge 10.2 into 10.3 2020-10-22 08:26:28 +03:00
Julius Goryavsky
888010d9dd MDEV-21951: mariabackup SST fail if data-directory have lost+found directory
To fix this, it is necessary to add an option to exclude the
database with the name "lost+found" from processing (the database
name will be checked by the check_if_skip_database_by_path() or
by the check_if_skip_database() function, and as a result
"lost+found" will be skipped).

In addition, it is necessary to slightly modify the verification
logic in the check_if_skip_database() function.

Also added a new test galera_sst_mariabackup_lost_found.test
2020-10-20 12:41:06 +02:00
Marko Mäkelä
1066312a12 MDEV-23982: Mariabackup hangs on backup
MDEV-13318 introduced a condition to Mariabackup that can cause it to
hang if the server goes idle after writing a log block that has no
payload after the 12-byte header. Normal recovery in log0recv.cc would
allow blocks with exactly 12 bytes of length, and only reject blocks
where the length field is shorter than that.
2020-10-19 20:36:05 +03:00
Marko Mäkelä
a0113683d7 Fixup 9028cc6b86
We forgot to change innodb_autoextend_increment from ULONG to
UINT (always 32-bit) in Mariabackup.
2020-10-16 07:51:37 +03:00
Marko Mäkelä
9028cc6b86 Cleanup: Make InnoDB page numbers uint32_t
InnoDB stores a 32-bit page number in page headers and in some
data structures, such as FIL_ADDR (consisting of a 32-bit page number
and a 16-bit byte offset within a page). For better compile-time
error detection and to reduce the memory footprint in some data
structures, let us use a uint32_t for the page number, instead
of ulint (size_t) which can be 64 bits.
2020-10-15 17:06:17 +03:00
Marko Mäkelä
7cffb5f6e8 MDEV-23399: Performance regression with write workloads
The buffer pool refactoring in MDEV-15053 and MDEV-22871 shifted
the performance bottleneck to the page flushing.

The configuration parameters will be changed as follows:

innodb_lru_flush_size=32 (new: how many pages to flush on LRU eviction)
innodb_lru_scan_depth=1536 (old: 1024)
innodb_max_dirty_pages_pct=90 (old: 75)
innodb_max_dirty_pages_pct_lwm=75 (old: 0)

Note: The parameter innodb_lru_scan_depth will only affect LRU
eviction of buffer pool pages when a new page is being allocated. The
page cleaner thread will no longer evict any pages. It used to
guarantee that some pages will remain free in the buffer pool. Now, we
perform that eviction 'on demand' in buf_LRU_get_free_block().
The parameter innodb_lru_scan_depth(srv_LRU_scan_depth) is used as follows:
 * When the buffer pool is being shrunk in buf_pool_t::withdraw_blocks()
 * As a buf_pool.free limit in buf_LRU_list_batch() for terminating
   the flushing that is initiated e.g., by buf_LRU_get_free_block()
The parameter also used to serve as an initial limit for unzip_LRU
eviction (evicting uncompressed page frames while retaining
ROW_FORMAT=COMPRESSED pages), but now we will use a hard-coded limit
of 100 or unlimited for invoking buf_LRU_scan_and_free_block().

The status variables will be changed as follows:

innodb_buffer_pool_pages_flushed: This includes also the count of
innodb_buffer_pool_pages_LRU_flushed and should work reliably,
updated one by one in buf_flush_page() to give more real-time
statistics. The function buf_flush_stats(), which we are removing,
was not called in every code path. For both counters, we will use
regular variables that are incremented in a critical section of
buf_pool.mutex. Note that show_innodb_vars() directly links to the
variables, and reads of the counters will *not* be protected by
buf_pool.mutex, so you cannot get a consistent snapshot of both variables.

The following INFORMATION_SCHEMA.INNODB_METRICS counters will be
removed, because the page cleaner no longer deals with writing or
evicting least recently used pages, and because the single-page writes
have been removed:
* buffer_LRU_batch_flush_avg_time_slot
* buffer_LRU_batch_flush_avg_time_thread
* buffer_LRU_batch_flush_avg_time_est
* buffer_LRU_batch_flush_avg_pass
* buffer_LRU_single_flush_scanned
* buffer_LRU_single_flush_num_scan
* buffer_LRU_single_flush_scanned_per_call

When moving to a single buffer pool instance in MDEV-15058, we missed
some opportunity to simplify the buf_flush_page_cleaner thread. It was
unnecessarily using a mutex and some complex data structures, even
though we always have a single page cleaner thread.

Furthermore, the buf_flush_page_cleaner thread had separate 'recovery'
and 'shutdown' modes where it was waiting to be triggered by some
other thread, adding unnecessary latency and potential for hangs in
relatively rarely executed startup or shutdown code.

The page cleaner was also running two kinds of batches in an
interleaved fashion: "LRU flush" (writing out some least recently used
pages and evicting them on write completion) and the normal batches
that aim to increase the MIN(oldest_modification) in the buffer pool,
to help the log checkpoint advance.

The buf_pool.flush_list flushing was being blocked by
buf_block_t::lock for no good reason. Furthermore, if the FIL_PAGE_LSN
of a page is ahead of log_sys.get_flushed_lsn(), that is, what has
been persistently written to the redo log, we would trigger a log
flush and then resume the page flushing. This would unnecessarily
limit the performance of the page cleaner thread and trigger the
infamous messages "InnoDB: page_cleaner: 1000ms intended loop took 4450ms.
The settings might not be optimal" that were suppressed in
commit d1ab89037a unless log_warnings>2.

Our revised algorithm will make log_sys.get_flushed_lsn() advance at
the start of buf_flush_lists(), and then execute a 'best effort' to
write out all pages. The flush batches will skip pages that were modified
since the log was written, or are are currently exclusively locked.
The MDEV-13670 message "page_cleaner: 1000ms intended loop took" message
will be removed, because by design, the buf_flush_page_cleaner() should
not be blocked during a batch for extended periods of time.

We will remove the single-page flushing altogether. Related to this,
the debug parameter innodb_doublewrite_batch_size will be removed,
because all of the doublewrite buffer will be used for flushing
batches. If a page needs to be evicted from the buffer pool and all
100 least recently used pages in the buffer pool have unflushed
changes, buf_LRU_get_free_block() will execute buf_flush_lists() to
write out and evict innodb_lru_flush_size pages. At most one thread
will execute buf_flush_lists() in buf_LRU_get_free_block(); other
threads will wait for that LRU flushing batch to finish.

To improve concurrency, we will replace the InnoDB ib_mutex_t and
os_event_t native mutexes and condition variables in this area of code.
Most notably, this means that the buffer pool mutex (buf_pool.mutex)
is no longer instrumented via any InnoDB interfaces. It will continue
to be instrumented via PERFORMANCE_SCHEMA.

For now, both buf_pool.flush_list_mutex and buf_pool.mutex will be
declared with MY_MUTEX_INIT_FAST (PTHREAD_MUTEX_ADAPTIVE_NP). The critical
sections of buf_pool.flush_list_mutex should be shorter than those for
buf_pool.mutex, because in the worst case, they cover a linear scan of
buf_pool.flush_list, while the worst case of a critical section of
buf_pool.mutex covers a linear scan of the potentially much longer
buf_pool.LRU list.

mysql_mutex_is_owner(), safe_mutex_is_owner(): New predicate, usable
with SAFE_MUTEX. Some InnoDB debug assertions need this predicate
instead of mysql_mutex_assert_owner() or mysql_mutex_assert_not_owner().

buf_pool_t::n_flush_LRU, buf_pool_t::n_flush_list:
Replaces buf_pool_t::init_flush[] and buf_pool_t::n_flush[].
The number of active flush operations.

buf_pool_t::mutex, buf_pool_t::flush_list_mutex: Use mysql_mutex_t
instead of ib_mutex_t, to have native mutexes with PERFORMANCE_SCHEMA
and SAFE_MUTEX instrumentation.

buf_pool_t::done_flush_LRU: Condition variable for !n_flush_LRU.

buf_pool_t::done_flush_list: Condition variable for !n_flush_list.

buf_pool_t::do_flush_list: Condition variable to wake up the
buf_flush_page_cleaner when a log checkpoint needs to be written
or the server is being shut down. Replaces buf_flush_event.
We will keep using timed waits (the page cleaner thread will wake
_at least_ once per second), because the calculations for
innodb_adaptive_flushing depend on fixed time intervals.

buf_dblwr: Allocate statically, and move all code to member functions.
Use a native mutex and condition variable. Remove code to deal with
single-page flushing.

buf_dblwr_check_block(): Make the check debug-only. We were spending
a significant amount of execution time in page_simple_validate_new().

flush_counters_t::unzip_LRU_evicted: Remove.

IORequest: Make more members const. FIXME: m_fil_node should be removed.

buf_flush_sync_lsn: Protect by std::atomic, not page_cleaner.mutex
(which we are removing).

page_cleaner_slot_t, page_cleaner_t: Remove many redundant members.

pc_request_flush_slot(): Replaces pc_request() and pc_flush_slot().

recv_writer_thread: Remove. Recovery works just fine without it, if we
simply invoke buf_flush_sync() at the end of each batch in
recv_sys_t::apply().

recv_recovery_from_checkpoint_finish(): Remove. We can simply call
recv_sys.debug_free() directly.

srv_started_redo: Replaces srv_start_state.

SRV_SHUTDOWN_FLUSH_PHASE: Remove. logs_empty_and_mark_files_at_shutdown()
can communicate with the normal page cleaner loop via the new function
flush_buffer_pool().

buf_flush_remove(): Assert that the calling thread is holding
buf_pool.flush_list_mutex. This removes unnecessary mutex operations
from buf_flush_remove_pages() and buf_flush_dirty_pages(),
which replace buf_LRU_flush_or_remove_pages().

buf_flush_lists(): Renamed from buf_flush_batch(), with simplified
interface. Return the number of flushed pages. Clarified comments and
renamed min_n to max_n. Identify LRU batch by lsn=0. Merge all the functions
buf_flush_start(), buf_flush_batch(), buf_flush_end() directly to this
function, which was their only caller, and remove 2 unnecessary
buf_pool.mutex release/re-acquisition that we used to perform around
the buf_flush_batch() call. At the start, if not all log has been
durably written, wait for a background task to do it, or start a new
task to do it. This allows the log write to run concurrently with our
page flushing batch. Any pages that were skipped due to too recent
FIL_PAGE_LSN or due to them being latched by a writer should be flushed
during the next batch, unless there are further modifications to those
pages. It is possible that a page that we must flush due to small
oldest_modification also carries a recent FIL_PAGE_LSN or is being
constantly modified. In the worst case, all writers would then end up
waiting in log_free_check() to allow the flushing and the checkpoint
to complete.

buf_do_flush_list_batch(): Clarify comments, and rename min_n to max_n.
Cache the last looked up tablespace. If neighbor flushing is not applicable,
invoke buf_flush_page() directly, avoiding a page lookup in between.

buf_flush_space(): Auxiliary function to look up a tablespace for
page flushing.

buf_flush_page(): Defer the computation of space->full_crc32(). Never
call log_write_up_to(), but instead skip persistent pages whose latest
modification (FIL_PAGE_LSN) is newer than the redo log. Also skip
pages on which we cannot acquire a shared latch without waiting.

buf_flush_try_neighbors(): Do not bother checking buf_fix_count
because buf_flush_page() will no longer wait for the page latch.
Take the tablespace as a parameter, and only execute this function
when innodb_flush_neighbors>0. Avoid repeated calls of page_id_t::fold().

buf_flush_relocate_on_flush_list(): Declare as cold, and push down
a condition from the callers.

buf_flush_check_neighbor(): Take id.fold() as a parameter.

buf_flush_sync(): Ensure that the buf_pool.flush_list is empty,
because the flushing batch will skip pages whose modifications have
not yet been written to the log or were latched for modification.

buf_free_from_unzip_LRU_list_batch(): Remove redundant local variables.

buf_flush_LRU_list_batch(): Let the caller buf_do_LRU_batch() initialize
the counters, and report n->evicted.
Cache the last looked up tablespace. If neighbor flushing is not applicable,
invoke buf_flush_page() directly, avoiding a page lookup in between.

buf_do_LRU_batch(): Return the number of pages flushed.

buf_LRU_free_page(): Only release and re-acquire buf_pool.mutex if
adaptive hash index entries are pointing to the block.

buf_LRU_get_free_block(): Do not wake up the page cleaner, because it
will no longer perform any useful work for us, and we do not want it
to compete for I/O while buf_flush_lists(innodb_lru_flush_size, 0)
writes out and evicts at most innodb_lru_flush_size pages. (The
function buf_do_LRU_batch() may complete after writing fewer pages if
more than innodb_lru_scan_depth pages end up in buf_pool.free list.)
Eliminate some mutex release-acquire cycles, and wait for the LRU
flush batch to complete before rescanning.

buf_LRU_check_size_of_non_data_objects(): Simplify the code.

buf_page_write_complete(): Remove the parameter evict, and always
evict pages that were part of an LRU flush.

buf_page_create(): Take a pre-allocated page as a parameter.

buf_pool_t::free_block(): Free a pre-allocated block.

recv_sys_t::recover_low(), recv_sys_t::apply(): Preallocate the block
while not holding recv_sys.mutex. During page allocation, we may
initiate a page flush, which in turn may initiate a log flush, which
would require acquiring log_sys.mutex, which should always be acquired
before recv_sys.mutex in order to avoid deadlocks. Therefore, we must
not be holding recv_sys.mutex while allocating a buffer pool block.

BtrBulk::logFreeCheck(): Skip a redundant condition.

row_undo_step(): Do not invoke srv_inc_activity_count() for every row
that is being rolled back. It should suffice to invoke the function in
trx_flush_log_if_needed() during trx_t::commit_in_memory() when the
rollback completes.

sync_check_enable(): Remove. We will enable innodb_sync_debug from the
very beginning.

Reviewed by: Vladislav Vaintroub
2020-10-15 17:04:56 +03:00
Marko Mäkelä
a9550c47e4 MDEV-16264 fixup: Remove unused code and data
LATCH_ID_OS_AIO_READ_MUTEX,
LATCH_ID_OS_AIO_WRITE_MUTEX,
LATCH_ID_OS_AIO_LOG_MUTEX,
LATCH_ID_OS_AIO_IBUF_MUTEX,
LATCH_ID_OS_AIO_SYNC_MUTEX: Remove. The tpool is not instrumented.

lock_set_timeout_event(): Remove.

srv_sys_mutex_key, srv_sys_t::mutex, SYNC_THREADS: Remove.

srv_slot_t::suspended: Remove. We only ever assigned this data member
true, so it is redundant.

ib_wqueue_wait(), ib_wqueue_timedwait(): Remove.

os_thread_join(): Remove.

os_thread_create(), os_thread_exit(): Remove redundant parameters.

These were missed in commit 5e62b6a5e0.
2020-09-30 14:28:11 +03:00
Marko Mäkelä
6ce0a6f9ad Merge 10.5 into 10.6 2020-09-24 10:21:26 +03:00
Marko Mäkelä
882ce206db Merge 10.4 into 10.5 2020-09-23 11:32:43 +03:00
Marko Mäkelä
952a028a52 Merge 10.3 into 10.4
We omit commit a3bdce8f1e
and commit a0e2a293bc
because they would make the test galera_3nodes.galera_gtid_2_cluster
fail and disable it.
2020-09-21 17:42:02 +03:00
Marko Mäkelä
2cf489d430 Merge 10.2 into 10.3 2020-09-21 16:39:23 +03:00
Marko Mäkelä
407d170c92 MDEV-23711 fixup: GCC -Og -Wmaybe-uninitialized 2020-09-21 16:29:29 +03:00
Vlad Lesin
0a224edc3e MDEV-23711 make mariabackup innodb redo log read error message more clear
log_group_read_log_seg() returns error when:

1) Calculated log block number does not correspond to read log block
number. This can be caused by:
  a) Garbage or an incompletely written log block. We can exclude this
  case by checking log block checksum if it's enabled(see innodb-log-checksums,
  encrypted log block contains checksum always).
  b) The log block is overwritten. In this case checksum will be correct and
  read log block number will be greater then requested one.

2) When log block length is wrong. In this case recv_sys->found_corrupt_log
is set.

3) When redo log block checksum is wrong. In this case innodb code
writes messages to error log with the following prefix: "Invalid log
block checksum."

The fix processes all the cases above.
2020-09-21 12:29:52 +03:00
Marko Mäkelä
3a423088ac Merge 10.3 into 10.4 2020-09-21 12:29:00 +03:00
Marko Mäkelä
cbcb4ecabb Merge 10.2 into 10.3 2020-09-21 11:04:04 +03:00
Vladislav Vaintroub
ccbe6bb6fc MDEV-19935 Create unified CRC-32 interface
Add CRC32C code to mysys. The x86-64 implementation uses PCMULQDQ in addition to CRC32 instruction
after Intel whitepaper, and is ported from rocksdb code.

Optimized ARM and POWER CRC32 were already present in mysys.
2020-09-17 16:07:37 +02:00
Vlad Lesin
80075ba011 MDEV-19264 Better support MariaDB GTID for Mariabackup's --slave-info option
Parse SHOW SLAVE STATUS output for the "Using_Gtid" column. If the value
is "No", then old log file and position is backed up, otherwise gtid_slave_pos
is backed up.
2020-09-14 11:14:50 +03:00
Marko Mäkelä
938db04898 Cleanup: Remove os0proc.* 2020-09-03 16:40:42 +03:00
Oleksandr Byelkin
5edf3e0388 Merge branch '10.5' into 10.6 2020-09-02 14:36:14 +02:00
Marko Mäkelä
2fa9f8c53a Merge 10.3 into 10.4 2020-08-20 11:01:47 +03:00
Marko Mäkelä
de0e7cd72a Merge 10.2 into 10.3 2020-08-20 09:12:16 +03:00
Marko Mäkelä
309302a3da MDEV-23475 InnoDB performance regression for write-heavy workloads
In commit fe39d02f51 (MDEV-20638)
we removed some wake-up signaling of the master thread that should
have been there, to ensure a steady log checkpointing workload.

Common sense suggests that the commit omitted some necessary calls
to srv_inc_activity_count(). But, an attempt to add the call to
trx_flush_log_if_needed_low() as well as to reinstate the function
innobase_active_small() did not restore the performance for the
case where sync_binlog=1 is set.

Therefore, we will revert the entire commit in MariaDB Server 10.2.
In MariaDB Server 10.5, adding a srv_inc_activity_count() call to
trx_flush_log_if_needed_low() did restore the performance, so we
will not revert MDEV-20638 across all versions.
2020-08-19 11:18:56 +03:00
Marko Mäkelä
4c50120d14 MDEV-23474 InnoDB fails to restart after SET GLOBAL innodb_log_checksums=OFF
Regretfully, the parameter innodb_log_checksums was introduced
in MySQL 5.7.9 (the first GA release of that series) by
mysql/mysql-server@af0acedd88
which partly replaced a parameter that had been introduced in 5.7.8
mysql/mysql-server@22ba38218e
as innodb_log_checksum_algorithm.

Given that the CRC-32C operations are accelerated on many processor
implementations (AMD64 with SSE4.2; since MDEV-22669 also on IA-32
with SSE4.2, POWER 8 and later, ARMv8 with some extensions)
and by lookup tables when only generic SISD instructions are available,
there should be no valid reason to disable checksums.

In MariaDB 10.5.2, as a preparation for MDEV-12353, MDEV-19543 deprecated
and ignored the parameter innodb_log_checksums altogether. This should
imply that after a clean shutdown with innodb_log_checksums=OFF one
cannot upgrade to MariaDB Server 10.5 at all.

Due to these problems, let us deprecate the parameter innodb_log_checksums
and honor it only during server startup.
The command SET GLOBAL innodb_log_checksums will always set the
parameter to ON.
2020-08-18 16:46:07 +03:00
Marko Mäkelä
cf87f3e08c Merge 10.4 into 10.5 2020-08-14 11:33:35 +03:00
Marko Mäkelä
2f7b37b021 Merge 10.3 into 10.4, except MDEV-22543
Also, fix GCC -Og -Wmaybe-uninitialized in run_backup_stage()
2020-08-13 18:48:41 +03:00
Marko Mäkelä
4bd56a697f Merge 10.2 into 10.3 2020-08-13 18:18:25 +03:00
Marko Mäkelä
31aef3ae99 Fix GCC 10.2.0 -Og -Wmaybe-uninitialized
For some reason, GCC emits more -Wmaybe-uninitialized warnings
when using the flag -Og than when using -O2. Many of the warnings
look genuine.
2020-08-11 15:58:16 +03:00
Marko Mäkelä
e685809a3b MDEV-23397 Remove deprecated InnoDB options in 10.6 2020-08-04 12:51:59 +03:00
Marko Mäkelä
9a7948e3f6 Merge 10.5 into 10.6 2020-08-04 07:55:16 +03:00
Marko Mäkelä
bbd70fcc43 MDEV-23379 Deprecate&ignore InnoDB concurrency throttling parameters
The parameters innodb_thread_concurrency and innodb_commit_concurrency
were useful years ago when both computing resources and the implementation
of some shared data structures were limited. MySQL 5.0 or 5.1 had trouble
scaling beyond 8 concurrent connections. Most of the scalability bottlenecks
have been removed since then, and the transactions per second delivered
by MariaDB Server 10.5 should not dramatically drop upon exceeding the
'optimal' number of connections.

Hence, enabling any concurrency throttling for InnoDB actually makes
things worse. We have seen many customers mistakenly setting this to a
small value like 16 or 64 and then complaining the server was slow.

Ignoring the parameters allows us to remove some normally unused code
and data structures, which could slightly improve performance.

innodb_thread_concurrency, innodb_commit_concurrency,
innodb_replication_delay, innodb_concurrency_tickets,
innodb_thread_sleep_delay, innodb_adaptive_max_sleep_delay:
Deprecate and ignore; hard-wire to 0.

The column INFORMATION_SCHEMA.INNODB_TRX.trx_concurrency_tickets
will always report 0.
2020-08-04 06:59:29 +03:00
Marko Mäkelä
50a11f396a Merge 10.4 into 10.5 2020-08-01 14:42:51 +03:00
Marko Mäkelä
9216114ce7 Merge 10.3 into 10.4 2020-07-31 18:09:08 +03:00
Marko Mäkelä
66ec3a770f Merge 10.2 into 10.3 2020-07-31 13:51:28 +03:00
Thirunarayanan Balathandayuthapani
fe39d02f51 MDEV-20638 Remove the deadcode from srv_master_thread() and srv_active_wake_master_thread_low()
- Due to commit fe95cb2e40 (MDEV-16125),
InnoDB master thread does not need to call srv_resume_thread()
and therefore there is no need to wake up the thread.
Due to the above patch, InnoDB should remove the following dead code.

srv_check_activity(): Makes the parameter as in,out and returns the
recent activity value

innobase_active_small(): Removed

srv_active_wake_master_thread(): Removed

srv_wake_master_thread(): Removed

srv_active_wake_master_thread_low(): Removed

Simplify srv_master_thread() and remove switch cases, added the assert.

Replace srv_wake_master_thread() with srv_inc_activity_count()

INNOBASE_WAKE_INTERVAL: Removed
2020-07-23 16:23:20 +05:30
Vladislav Vaintroub
9701759b3d MDEV-23043 Refactor Windows service handling
Removed the existing nt_service classes - they provide little
abstraction, and only obscure a relatively simple service handling.
This replaces by similar code inspired by MS docs samples.

Service handling is now moved into winmain.cc, which contains
the main() function for Windows.

winmain provides reporting callbacks, which should be used by external code
,to report transitions from starting to running to shutting down to stopped.

Removed a  do-nothing ServiceMain thread, and the
non-working service "pause/continue".  Removed a lot of #ifdef __WIN__
code from mysqld.cc
2020-07-04 18:24:40 +02:00
Vladislav Vaintroub
272828a171 Merge branch '10.5' into 10.6 2020-07-04 11:53:26 +02:00
Marko Mäkelä
1813d92d0c Merge 10.4 into 10.5 2020-07-02 09:41:44 +03:00
Oleksandr Byelkin
b0f836053b MDEV-22983: Mariabackup's --help option disappeared
return --help option
2020-07-01 23:30:44 +03:00
Oleksandr Byelkin
5018b998a7 return --help option 2020-06-23 17:07:03 +02:00
Marko Mäkelä
cfd3d70ccb MDEV-22871: Remove pointer indirection for InnoDB hash_table_t
hash_get_n_cells(): Remove. Access n_cells directly.

hash_get_nth_cell(): Remove. Access array directly.

hash_table_clear(): Replaced with hash_table_t::clear().

hash_table_create(), hash_table_free(): Remove.

hash0hash.cc: Remove.
2020-06-18 14:16:01 +03:00
Marko Mäkelä
c515b1d092 Merge 10.4 into 10.5 2020-06-18 13:58:54 +03:00
Vlad Lesin
205b0ce6ad MDEV-22894: Mariabackup should not read [mariadb-client] option group
from configuration files
2020-06-18 12:19:48 +03:00
Vlad Lesin
9bdf35e90f MDEV-18215: mariabackup does not report unknown command line options
MDEV-21298: mariabackup doesn't read from the [mariadbd] and [mariadbd-X.Y]
server option groups from configuration files
MDEV-21301: mariabackup doesn't read [mariadb-backup] option group in
configuration file

All three issues require to change the same code, that is why their
fixes are joined in one commit.

The fix is in invoking load_defaults_or_exit() and handle_options() for
backup-specific groups separately from client-server groups to let the last
handle_options() call fail on unknown backup-specific options.

The order of options procesing is the following:
1) Load server groups and process server options, ignore unknown
options
2) Load client groups and process client options, ignore unknown
options
3) Load backup groups and process client-server options, exit on
unknown option
4) Process --mysqld-args command line options, ignore unknown options

New global flag my_handle_options_init_variables was added to have
ability to invoke handle_options() for the same allowed options set
several times without re-initialising previously set option values.

--password value destroying is moved from option processing callback to
mariabackup's handle_options() function to have ability to invoke server's
handle_options() several times for the same possible allowed options
set.

Galera invokes wsrep_sst_mariabackup.sh with mysqld command line
options to configure mariabackup as close to the server as possible.
It is not known what server options are supported by mariabackup when the
script is invoked. That is why new mariabackup option "--mysqld-args" is added,
all unknown options that follow this option will be silently ignored.

wsrep_sst_mariabackup.sh was also changed to:
- use "--mysqld-args" mariabackup option to pass mysqld options,
- remove deprecated innobackupex mode,
- remove unsupported mariabackup options:
    --encrypt
    --encrypt-key
    --rebuild-indexes
    --rebuild-threads
2020-06-14 13:23:07 +03:00
mysqlonarm
dec3f8ca69
MDEV-22641: Provide SIMD optimized wrapper for zlib crc32() (#1558)
Existing implementation used my_checksum (from mysys)
for calculating table checksum and binlog checksum.

This implementation was optimized for powerpc only and lacked
SIMD implementation for x86 (using clmul) and ARM
(using ACLE) instead used zlib-crc32.

mariabackup had its own copy of the crc32 implementation
using hardware optimized implementation only for x86 and lagged
hardware based implementation for powerpc and ARM.

Patch helps unifies all such calls and help aggregate all of them
using an unified interface my_checksum().

Said unification also enables hardware optimized calls for all
architecture viz. x86, ARM, POWERPC.
Default always fallback to zlib crc32.

Thanks to Daniel Black for reviewing, fixing and testing
PowerPC changes. Thanks to Marko and Daniel for early code feedback.
2020-06-01 11:34:06 +03:00
Marko Mäkelä
5ece2155cb Merge 10.4 into 10.5 2020-05-20 17:46:05 +03:00
Marko Mäkelä
2bf93a8fd6 Merge 10.3 into 10.4 2020-05-19 21:18:15 +03:00
Marko Mäkelä
79ed33c184 Merge 10.2 into 10.3 2020-05-19 17:05:05 +03:00
Vlad Lesin
0f9bfcc323 MDEV-22554: "mariabackup --prepare" exits with code 0 even though innodb
error is logged

The fix is to set flag in ib::error::~error() and check it in
mariabackup.

ib::error::error() is replaced with ib::warn::warn() in
AIO::linux_create_io_ctx() because of two reasons:

1) if we leave it as is, then mariabackup MTR tests will fail with --mem
option, because Linux AIO can not be used on tmpfs,

2) when Linux AIO can not be initialized, InnoDB falls back to simulated
AIO, so such sutiation is not fatal error, it should be treated as warning.
2020-05-19 11:25:56 +03:00
Vladislav Vaintroub
e2bc029211 MDEV-7021 Pass directory security descriptor from mysql_install_db.exe to bootstrap
This ensures that directory permissions are correct in all cases, even if
boostrap is passed non-standard locations for innodb.

Directory permissions are copied from the datadir.
2020-05-18 18:11:40 +02:00
Vladislav Vaintroub
8cb3060c5c MDEV-22272 Windows installer - run service unter virtual service account
Change mysql_install_db.exe to run service under virtual account.
Set directory permissions so that service has full access to data files.

mariabackup --copy-back permission handling (MDEV-17008) needs to be
changed as well.
Now, whenever a directory is created in course of copy-back,
its permissions are copied from the datadir. This handling assumes,
that datadir already has the correct permissions for the Windows service.
2020-05-18 18:07:26 +02:00
Marko Mäkelä
18a62eb76d MDEV-21133 follow-up: Use fil_page_get_type()
Let us use the common accessor function fil_page_get_type()
instead of accessing the page header field FIL_PAGE_TYPE directly.
2020-05-07 17:15:34 +03:00
Eugene Kosov
89ff4176c1 MDEV-22437 make THR_THD* variable thread_local
Now all access goes through _current_thd() and set_current_thd()
functions.

Some functions like THD::store_globals() can not fail now.
2020-05-05 18:13:31 +03:00
Marko Mäkelä
496d0372ef Merge 10.4 into 10.5 2020-04-29 15:40:51 +03:00
Marko Mäkelä
0632b8034b Merge 10.3 into 10.4 2020-04-29 09:05:15 +03:00
Marko Mäkelä
1fbdcada73 Merge 10.2 into 10.3 2020-04-28 22:29:13 +03:00
Vlad Lesin
d0150dc14e MDEV-20230: mariabackup --ftwrl-wait-timeout never times out on explicit
lock

--ftwrl-wait-timeout does not finish mariabackup execution when acquired
backup lock can't be grabbed for the certain amount of time, it just
waits for a long queries finishing before acquiring the lock to avoid
unnecessary locking.

This commit extends --ftwrl-wait-timeout so, that mariabackup execution
is finished if it waits for backup lock during certain amount of time.
2020-04-27 22:10:50 +03:00
Marko Mäkelä
fbe2712705 Merge 10.4 into 10.5
The functional changes of commit 5836191c8f
(MDEV-21168) are omitted due to MDEV-742 having addressed the issue.
2020-04-25 21:57:52 +03:00
Marko Mäkelä
88cf6f1c7f Merge 10.3 into 10.4 2020-04-22 18:18:51 +03:00
Marko Mäkelä
455cf6196c Merge 10.2 into 10.3 2020-04-22 14:45:55 +03:00
Vlad Lesin
0efe1971c6 MDEV-19347: Mariabackup does not honor ignore_db_dirs from server
config.

The solution is to read the system variable value on startup and to fill
databases_exclude_hash.

xb_load_list_string() became non-static and was reformatted. The system
variable value is read and processed in get_mysql_vars(), which was also
reformatted.
2020-04-21 10:34:37 +03:00
Marko Mäkelä
af91266498 Merge 10.3 into 10.4
In main.index_merge_myisam we remove the test that was added in
commit a2d24def8c because
it duplicates the test case that was added in
commit 5af12e4635.
2020-04-16 12:12:26 +03:00
Marko Mäkelä
84db10f27b Merge 10.2 into 10.3 2020-04-15 09:56:03 +03:00
Thirunarayanan Balathandayuthapani
6bbc0eedc6 MDEV-22193 Avoid un-necessary page initialization during recovery
- InnoDB is doing un-necessary redo log page initialisation during
recovery and unnecessary traversal of redo log during last phase.
This patch does the optimization of removing unnecessary redo log page
initialisation and detects the memory exhaust earlier.
2020-04-09 21:25:31 +05:30
Vlad Lesin
5836191c8f MDEV-21168: Active XA transactions stop slave from working after backup
was restored.

Optionally rollback prepared XA's on "mariabackup --prepare".

The fix MUST NOT be ported on 10.5+, as MDEV-742 fix solves the issue for
slaves.
2020-04-07 15:05:38 +03:00
Daniel Black
e8351934b6
Merge pull request #1221 from grooverdan/10.4-MDEV-18851-multiple-sized-large-page-support
MDEV-18851: multiple sized large page support (linux)
2020-04-02 23:54:08 +04:00
Marko Mäkelä
aae3f921ad Cleanup recv_sys: Move things to members
recv_sys.recovery_on: Replaces recv_recovery_on.

recv_sys_t::apply(): Replaces recv_apply_hashed_log_recs().

recv_sys_var_init(): Remove.

recv_sys_t::recover_low(): Attempt to initialize a page based
on buffered redo log records.
2020-03-30 18:45:09 +03:00
Vladislav Vaintroub
1b58ef7d3f Build cleanups.
Fix clang-cl built
2020-03-25 15:53:38 +01:00
Rasmus Johansson
9e1b3af4a4 MDEV-21303 Make executables MariaDB named
To change all executables to have a mariadb name I had to:
- Do name changes in every CMakeLists.txt that produces executables
- CREATE_MARIADB_SYMLINK was removed and GET_SYMLINK added by Wlad to reuse the function in other places also
- The scripts/CMakeLists.txt could make use of GET_SYMLINK instead of introducing redundant code, but I thought I'll leave that for next release
- A lot of changes to debian/.install and debian/.links files due to swapping of real executable and symlink. I did not however change the name of the manpages, so the real name is still mysql there and mariadb are symlinks.
- The Windows part needed a change now when we made the executables mariadb -named. MSI (and ZIP) do not support symlinks and to not break backward compatibility we had to include mysql named binaries also. Done by Wlad
2020-03-21 20:20:29 +01:00
Sergei Golubchik
91d1588d30 Merge branch 'github/10.5' into 10.5 2020-03-14 09:52:35 +01:00
Marko Mäkelä
f224525204 MDEV-21907: InnoDB: Enable -Wconversion on clang and GCC
The -Wconversion in GCC seems to be stricter than in clang.
GCC at least since version 4.4.7 issues truncation warnings for
assignments to bitfields, while clang 10 appears to only issue
warnings when the sizes in bytes rounded to the nearest integer
powers of 2 are different.

Before GCC 10.0.0, -Wconversion required more casts and would not
allow some operations, such as x<<=1 or x+=1 on a data type that
is narrower than int.

GCC 5 (but not GCC 4, GCC 6, or any later version) is complaining
about x|=y even when x and y are compatible types that are narrower
than int.  Hence, we must rewrite some x|=y as
x=static_cast<byte>(x|y) or similar, or we must disable -Wconversion.

In GCC 6 and later, the warning for assigning wider to bitfields
that are narrower than 8, 16, or 32 bits can be suppressed by
applying a bitwise & with the exact bitmask of the bitfield.
For older GCC, we must disable -Wconversion for GCC 4 or 5 in such
cases.

The bitwise negation operator appears to promote short integers
to a wider type, and hence we must add explicit truncation casts
around them. Microsoft Visual C does not allow a static_cast to
truncate a constant, such as static_cast<byte>(1) truncating int.
Hence, we will use the constructor-style cast byte(~1) for such cases.

This has been tested at least with GCC 4.8.5, 5.4.0, 7.4.0, 9.2.1, 10.0.0,
clang 9.0.1, 10.0.0, and MSVC 14.22.27905 (Microsoft Visual Studio 2019)
on 64-bit and 32-bit targets (IA-32, AMD64, POWER 8, POWER 9, ARMv8).
2020-03-12 19:46:41 +02:00
Oleksandr Byelkin
fad47df995 Merge branch '10.4' into 10.5 2020-03-11 17:52:49 +01:00
Oleksandr Byelkin
b7362d5fbc Merge branch '10.3' into 10.4 2020-03-11 14:28:24 +01:00
Oleksandr Byelkin
3c9bc0ce19 Merge branch '10.2' into 10.3 2020-03-11 14:05:41 +01:00
Marko Mäkelä
574d8b2940 MDEV-21907: Fix most clang -Wconversion in InnoDB
Declare innodb_purge_threads as 4-byte integer (UINT)
instead of 4-or-8-byte (ULONG) and adjust the documentation string.
2020-03-11 08:29:48 +02:00
Sergei Golubchik
c1c5222cae cleanup: PSI key is *always* the first argument 2020-03-10 19:24:23 +01:00
Sergei Golubchik
7c58e97bf6 perfschema memory related instrumentation changes 2020-03-10 19:24:22 +01:00
Eugene Kosov
2b8b85bd0a fix use-after-free 2020-03-10 15:14:53 +03:00
Marko Mäkelä
4383897a01 MDEV-14425 preparation: Remove log_header_read()
The function log_header_read() was only used during server startup,
and it will mostly be used only for reading checkpoint information
from pre-MDEV-14425 format redo log files.

Let us replace the function with more direct calls, so that
it is clearer what is going on. It is not strictly necessary to
hold any mutex during this operation, and because there will be
only a limited number of operations during early server startup,
it is not necessary to increment any I/O counters.
2020-03-04 10:08:33 +02:00
Marko Mäkelä
8511f04fdb Cleanup: Remove srv_start_lsn
Most of the time, we can refer to recv_sys.recovered_lsn.
2020-03-02 15:01:46 +02:00
Eugene Kosov
9ef2d29ff4 MDEV-14425 deprecate and ignore innodb_log_files_in_group
Now there can be only one log file instead of several which
logically work as a single file.

Possible names of redo log files: ib_logfile0,
ib_logfile101 (for just created one)

innodb_log_fiels_in_group: value of this variable is not used
by InnoDB. Possible values are still 1..100, to not break upgrade

LOG_FILE_NAME: add constant of value "ib_logfile0"
LOG_FILE_NAME_PREFIX: add constant of value "ib_logfile"

get_log_file_path(): convenience function that returns full
path of a redo log file

SRV_N_LOG_FILES_MAX: removed

srv_n_log_files: we can't remove this for compatibility reasons,
but now server doesn't use this variable

log_sys_t::file::fd: now just one, not std::vector

log_sys_t::log_capacity: removed word 'group'

find_and_check_log_file(): part of logic from huge srv_start()
moved here

recv_sys_t::files: file descriptors of redo log files.
There can be several of those in case we're upgrading
from older MariaDB version.

recv_sys_t::remove_extra_log_files: whether to remove
ib_logfile{1,2,3...} after successfull upgrade.

recv_sys_t::read(): open if needed and read from one
of several log files

recv_sys_t::files_size(): open if needed and return files count

redo_file_sizes_are_correct(): check that redo log files
sizes are equal. Just to log an error for a user.
Corresponding check was moved from srv0start.cc

namespace deprecated: put all deprecated variables here to
prevent usage of it by us, developers
2020-02-19 12:21:59 +03:00
Marko Mäkelä
f8a9f90667 MDEV-12353: Remove support for crash-upgrade
We tighten some assertions regarding dict_index_t::is_dummy
and crash recovery, now that redo log processing will
no longer create dummy objects.
2020-02-13 19:13:45 +02:00
Marko Mäkelä
7ae21b18a6 MDEV-12353: Change the redo log encoding
log_t::FORMAT_10_5: physical redo log format tag

log_phys_t: Buffered records in the physical format.
The log record bytes will follow the last data field,
making use of alignment padding that would otherwise be wasted.
If there are multiple records for the same page, also those
may be appended to an existing log_phys_t object if the memory
is available.

In the physical format, the first byte of a record identifies the
record and its length (up to 15 bytes). For longer records, the
immediately following bytes will encode the remaining length
in a variable-length encoding. Usually, a variable-length-encoded
page identifier will follow, followed by optional payload, whose
length is included in the initially encoded total record length.

When a mini-transaction is updating multiple fields in a page,
it can avoid repeating the tablespace identifier and page number
by setting the same_page flag (most significant bit) in the first
byte of the log record. The byte offset of the record will be
relative to where the previous record for that page ended.

Until MDEV-14425 introduces a separate file-level log for
redo log checkpoints and file operations, we will write the
file-level records in the page-level redo log file.
The record FILE_CHECKPOINT (which replaces MLOG_CHECKPOINT)
will be removed in MDEV-14425, and one sequential scan of the
page recovery log will suffice.

Compared to MLOG_FILE_CREATE2, FILE_CREATE will not include any flags.
If the information is needed, it can be parsed from WRITE records that
modify FSP_SPACE_FLAGS.

MLOG_ZIP_WRITE_STRING: Remove. The record was only introduced temporarily
as part of this work, before being replaced with WRITE (along with
MLOG_WRITE_STRING, MLOG_1BYTE, MLOG_nBYTES).

mtr_buf_t::empty(): Check if the buffer is empty.

mtr_t::m_n_log_recs: Remove. It suffices to check if m_log is empty.

mtr_t::m_last, mtr_t::m_last_offset: End of the latest m_log record,
for the same_page encoding.

page_recv_t::last_offset: Reflects mtr_t::m_last_offset.

Valid values for last_offset during recovery should be 0 or above 8.
(The first 8 bytes of a page are the checksum and the page number,
and neither are ever updated directly by log records.)
Internally, the special value 1 indicates that the same_page form
will not be allowed for the subsequent record.

mtr_t::page_create(): Take the block descriptor as parameter,
so that it can be compared to mtr_t::m_last. The INIT_INDEX_PAGE
record will always followed by a subtype byte, because same_page
records must be longer than 1 byte.

trx_undo_page_init(): Combine the writes in WRITE record.

trx_undo_header_create(): Write 4 bytes using a special MEMSET
record that includes 1 bytes of length and 2 bytes of payload.

flst_write_addr(): Define as a static function. Combine the writes.

flst_zero_both(): Replaces two flst_zero_addr() calls.

flst_init(): Do not inline the function.

fsp_free_seg_inode(): Zerofill the whole inode.

fsp_apply_init_file_page(): Initialize FIL_PAGE_PREV,FIL_PAGE_NEXT
to FIL_NULL when using the physical format.

btr_create(): Assert !page_has_siblings() because fsp_apply_init_file_page()
must have been invoked.

fil_ibd_create(): Do not write FILE_MODIFY after FILE_CREATE.

fil_names_dirty_and_write(): Remove the parameter mtr.
Write the records using a separate mini-transaction object,
because any FILE_ records must be at the start of a mini-transaction log.

recv_recover_page(): Add a fil_space_t* parameter.
After applying log to the a ROW_FORMAT=COMPRESSED page,
invoke buf_zip_decompress() to restore the uncompressed page.

buf_page_io_complete(): Remove the temporary hack to discard the
uncompressed page of a ROW_FORMAT=COMPRESSED page.

page_zip_write_header(): Remove. Use mtr_t::write() or
mtr_t::memset() instead, and update the compressed page frame
separately.

trx_undo_header_add_space_for_xid(): Remove.

trx_undo_seg_create(): Perform the changes that were previously
made by trx_undo_header_add_space_for_xid().

btr_reset_instant(): New function: Reset the table to MariaDB 10.2
or 10.3 format when rolling back an instant ALTER TABLE operation.

page_rec_find_owner_rec(): Merge with the only callers.

page_cur_insert_rec_low(): Combine writes by using a local buffer.
MEMMOVE data from the preceding record whenever feasible
(copying at least 3 bytes).

page_cur_insert_rec_zip(): Combine writes to page header fields.

PageBulk::insertPage(): Issue MEMMOVE records to copy a matching
part from the preceding record.

PageBulk::finishPage(): Combine the writes to the page header
and to the sparse page directory slots.

mtr_t::write(): Only log the least significant (last) bytes
of multi-byte fields that actually differ.

For updating FSP_SIZE, we must always write all 4 bytes to the
redo log, so that the fil_space_set_recv_size() logic in
recv_sys_t::parse() will work.

mtr_t::memcpy(), mtr_t::zmemcpy(): Take a pointer argument
instead of a numeric offset to the page frame. Only log the
last bytes of multi-byte fields that actually differ.

In fil_space_crypt_t::write_page0(), we must log also any
unchanged bytes, so that recovery will recognize the record
and invoke fil_crypt_parse().

Future work:
MDEV-21724 Optimize page_cur_insert_rec_low() redo logging
MDEV-21725 Optimize btr_page_reorganize_low() redo logging
MDEV-21727 Optimize redo logging for ROW_FORMAT=COMPRESSED
2020-02-13 19:12:17 +02:00
Marko Mäkelä
67c76704a8 MDEV-12353: Remove MLOG_INDEX_LOAD (innodb_log_optimize_ddl)
NOTE: This may break crash-upgrade from a dataset that was
created with innodb_log_optimize_ddl=ON. Also due to
ROW_FORMAT=COMPRESSED pages, it will be easiest to disallow
crash-upgrade.

It would be more robust to disable the MDEV-12699 logic when
crash-upgrading from old redo log format.

log_optimized_ddl_op: Remove.

fil_space_t::enable_lsn, file_name_t::enable_lsn: Remove.

ddl_tracker_t::optimized_ddl: Remove.

TODO: Remove ddl_tracker
2020-02-13 18:19:15 +02:00
Marko Mäkelä
1a6f708ec5 MDEV-15058: Deprecate and ignore innodb_buffer_pool_instances
Our benchmarking efforts indicate that the reasons for splitting the
buf_pool in commit c18084f71b
have mostly gone away, possibly as a result of
mysql/mysql-server@ce6109ebfd
or similar work.

Only in one write-heavy benchmark where the working set size is
ten times the buffer pool size, the buf_pool->mutex would be
less contended with 4 buffer pool instances than with 1 instance,
in buf_page_io_complete(). That contention could be alleviated
further by making more use of std::atomic and by splitting
buf_pool_t::mutex further (MDEV-15053).

We will deprecate and ignore the following parameters:

	innodb_buffer_pool_instances
	innodb_page_cleaners

There will be only one buffer pool and one page cleaner task.

In a number of INFORMATION_SCHEMA views, columns that indicated
the buffer pool instance will be removed:

	information_schema.innodb_buffer_page.pool_id
	information_schema.innodb_buffer_page_lru.pool_id
	information_schema.innodb_buffer_pool_stats.pool_id
	information_schema.innodb_cmpmem.buffer_pool_instance
	information_schema.innodb_cmpmem_reset.buffer_pool_instance
2020-02-12 14:45:21 +02:00
Eugene Kosov
691c691adc clean up redo log
main change: rename first redo log without file close

second change: use os_offset_t to represent offset in a file

third change: fix log texts
2020-02-01 23:58:24 +08:00
Marko Mäkelä
50324ce624 MDEV-21351 Replace recv_sys.heap with list of buf_block_t
InnoDB crash recovery used a special type of mem_heap_t that
allocates backing store from the buffer pool. That incurred
a significant overhead, leading to underutilization of memory,
and limiting the maximum contiguous allocated size of a log record.

recv_sys_t::blocks: A linked list of buf_block_t that are allocated
by buf_block_alloc() for redo log records. Replaces recv_sys_t::heap.
We repurpose buf_block_t::unzip_LRU for linking the elements.

recv_sys_t::max_log_blocks: Renamed from recv_n_pool_free_frames.

recv_sys_t::max_blocks(): Accessor for max_log_blocks.

recv_sys_t::alloc(): Allocate memory from the current recv_sys_t::blocks
element, or allocate another block.  In debug builds, various free()
member functions must be invoked, because we repurpose
buf_page_t::buf_fix_count for tracking allocations.

recv_sys_t::free_corrupted_page(): Renamed from recv_recover_corrupt_page()

recv_sys_t::is_memory_exhausted(): Renamed from recv_sys_heap_check()

recv_sys_t::pages and its elements are allocated directly by the
system memory allocator.

recv_parse_log_recs(): Remove the parameter available_memory.

We rename some variables 'store_to_hash' to 'store', because
recv_sys.pages is not actually a hash table.

This is joint work with Thirunarayanan Balathandayuthapani.
2020-01-29 12:53:39 +02:00
Marko Mäkelä
a983b24407 Merge 10.4 into 10.5 2020-01-28 14:17:09 +02:00
Oleksandr Byelkin
bfc24bb2ec Merge branch '10.3' into 10.4 2020-01-24 14:50:23 +01:00
Oleksandr Byelkin
ceda5f724f Merge branch '10.2' into 10.3 2020-01-24 14:16:20 +01:00
Oleksandr Byelkin
f2ccfcaca1 Merge branch '10.1' into 10.2 2020-01-24 13:46:49 +01:00
Julius Goryavsky
982294ac16 MDEV-17601: MariaDB Galera does not expect 'mbstream' as streamfmt
Setting "streamfmt=mbstream" in the "[sst]" section causes SST to fail
because the format automatically switches to 'tar' by default (insead
of mbstream).

To fix this, we need to add mbstream to the list of valid values for
the format, making it synonymous with xbstream. This must be done both
in the SST script and when parsing the options of the corresponding
utilities.
2020-01-21 10:50:48 +01:00
Eugene Kosov
e9de6386ad MDEV-18115 remove now unneeded constraint
log_group_max_size: is not needed because redo log do not use fil_io() now
2020-01-18 23:42:55 +08:00
Sergei Golubchik
ff5a528f26 mysqltest crashes on Debian
Debian is apparently offended that pcre2-posix implements POSIX API,
thus it renames all posix-compatible symbols in libpcre2-posix to have the
PCRE2 prefix. But Debian doesn't do anything to pcre2posix.h header,
so any unaware application will get POSIX compatible type names
and function prototypes from pcre2, but actual symbols will come
from libc.

To remedy this enormous incongruity we have to redefine POSIX-compatible
function names in pcre2posix to match Debian's hack.
2020-01-16 18:13:55 +01:00
Eugene Kosov
562c037b48 MDEV-18115 Remove dummy tablespace for the redo log
Redo log subsystem was decoupled from tablespace subsystem. It now manages file
descriptors for redo log files by itself.

FIL_TYPE_LOG: removed, code in various places was simplified

SRV_LOG_SPACE_FIRST_ID: renamed to SRV_SPACE_ID_UPPER_BOUND
  to better match its purpose. Code in various places was simplified

fil_n_log_flushes: replaced with log_sys::flushes
fil_n_pending_log_flushes: replaced with log_sys::pending_flushes

log_t::files::files: redo log file descriptors
log_t::files::file_names: redo log file names

log_t::files::set_file_names(): set file names without opening them
log_t::files::open_files(): opens redo log files
log_t::files::read(): treats several files as one big
log_t::files::write(): treats several files as one big
log_t::files::fsync(): flushes page cache to disk
log_t::files::close_files(): closes redo log files

fil_open_log_and_system_tablespace_files(): renamed to
  fil_open_system_tablespace_files()
  and obviously it now doesn't open redo log files

global files[1000]: removed. Why it was needed at all?
2020-01-01 22:09:51 +08:00
Marko Mäkelä
8cc15c036d Merge 10.4 into 10.5 2019-12-27 21:17:16 +02:00
Marko Mäkelä
4c25e75ce7 Merge 10.3 into 10.4 2019-12-27 18:20:28 +02:00
Marko Mäkelä
5ab70e7f68 Merge 10.2 into 10.3 2019-12-27 15:14:48 +02:00
Thirunarayanan Balathandayuthapani
bba59abb03 MDEV-19176 Reduce the memory usage during recovery
- Moved the recv_sys->heap memory condition inside recv_parse_log_recs().
So that, InnoDB can mark the status as STORE_NO earlier.

- InnoDB uses one third of buffer pool chunk size for reading the redo
log records. In that case, we can avoid the scenario where buffer ran
out of memory issue during recovery.
2019-12-23 15:51:02 +05:30
Alexey Botchkov
9dadfdcde5 MDEV-14024 PCRE2.
Related changes in the server code.
2019-12-21 10:34:02 +01:00
Marko Mäkelä
28c89b7151 Merge 10.4 into 10.5 2019-12-16 07:47:17 +02:00
Marko Mäkelä
8fa759a576 Merge 10.3 into 10.4
We disable the MDEV-21189 test galera.galera_partition
because it times out.
2019-12-13 17:30:37 +02:00
Marko Mäkelä
0a20e5ab77 Merge 10.2 into 10.3 2019-12-12 14:41:51 +02:00
Vlad Lesin
beec9c0e19 MDEV-21255: Deadlock of parallel slave and mariabackup (with failed log
copy thread)

mariabackup hangs waiting until innodb redo log thread read log till certain
LSN, and it waits under FTWRL. If there is redo log read error in the thread,
it is finished, and main thread knows nothing about it, what leads to hanging.
As it hangs under FTWRL, slave threads on server side can be blocked due
to MDL lock conflict.

The fix is to finish mariabackup with error message on innodb redo log read
failure.
2019-12-12 13:28:30 +03:00
Vladislav Vaintroub
202a62deb0 MDEV-11345 Compile english error messages into mysqld executable.
Simplify loading messages into mariabackup. Do the same as server does
We're forcing english, so there is no attempt to load errmsg.sys
2019-12-11 16:15:40 +01:00
Oleksandr Byelkin
a15234bf4b Merge branch '10.3' into 10.4 2019-12-09 15:09:41 +01:00
Marko Mäkelä
42a4ae54c2 MDEV-21225 Remove ut_align() and use aligned_malloc()
Before commit 90c52e5291 introduced
aligned_malloc(), InnoDB always used a pattern of over-allocating
memory and invoking ut_align() to guarantee the desired alignment.

It is cleaner to invoke aligned_malloc() and aligned_free() directly.

ut_align(): Remove. In assertions, ut_align_down() can be used instead.
2019-12-05 06:42:31 +02:00
Jan Lindström
9d9a2253c6 Merge remote-tracking branch 10.2 into 10.3
Conflicts:
	mysql-test/suite/galera/t/galera_binlog_event_max_size_max-master.opt
	mysql-test/suite/innodb/r/innodb-mdev-7513.result
	mysql-test/suite/innodb/t/innodb-mdev-7513.test
	mysql-test/suite/wsrep/disabled.def
	storage/innobase/ibuf/ibuf0ibuf.cc
2019-12-02 14:35:10 +02:00
Faustin Lammler
2df2238cb8 Lintian complains on spelling error
The lintian check complains on spelling error:
https://salsa.debian.org/mariadb-team/mariadb-10.3/-/jobs/95739
2019-12-02 12:41:13 +02:00
Vlad Lesin
bd11bd63cc MDEV-18310: Aria engine: Undo phase failed with "Got error 121 when
executing undo undo_key_delete" upon startup on datadir restored from
incremental backup

aria_log* files were not copied on --prepare --incremental-dir step from
incremental to destination backup directory.
2019-11-29 17:01:12 +03:00
Marko Mäkelä
312569e2fd MDEV-21132 Remove buf_page_t::newest_modification
At each mini-transaction commit, the log sequence number of the
mini-transaction must be written to each modified page, so that
it will be available in the FIL_PAGE_LSN field when the page is
being read in crash recovery.

InnoDB was unnecessarily allocating redundant storage for the
field, in buf_page_t::newest_modification. Let us access
FIL_PAGE_LSN directly.

Furthermore, on ALTER TABLE...IMPORT TABLESPACE, let us write
0 to FIL_PAGE_LSN instead of using log_sys.lsn.

buf_flush_init_for_writing(), buf_flush_update_zip_checksum(),
fil_encrypt_buf_for_full_crc32(), fil_encrypt_buf(),
fil_space_encrypt(): Remove the parameter lsn.

buf_page_get_newest_modification(): Merge with the only caller.

buf_tmp_reserve_compression_buf(), buf_tmp_page_encrypt(),
buf_page_encrypt(): Define static in the same compilation unit
with the only caller.

PageConverter::m_current_lsn: Remove. Write 0 to FIL_PAGE_LSN
on ALTER TABLE...IMPORT TABLESPACE.
2019-11-25 09:39:51 +02:00
Vladislav Vaintroub
5e62b6a5e0 MDEV-16264 Use threadpool for Innodb background work.
Almost all threads have gone
- the "ticking" threads, that sleep a while then do some work)
(srv_monitor_thread, srv_error_monitor_thread, srv_master_thread)
were replaced with timers. Some timers are periodic,
e.g the "master" timer.

- The btr_defragment_thread is also replaced by a timer , which
reschedules it self when current defragment "item" needs throttling

- the buf_resize_thread and buf_dump_threads are substitutes with tasks
Ditto with page cleaner workers.

- purge workers threads are not tasks as well, and purge cleaner
coordinator is a combination of a task and timer.

- All AIO is outsourced to tpool, Innodb just calls thread_pool::submit_io()
and provides the callback.

- The srv_slot_t was removed, and innodb_debug_sync used in purge
is currently not working, and needs reimplementation.
2019-11-15 18:09:30 +01:00
Vladislav Vaintroub
009674dc52 Fix a couple of clang-cl warnings 2019-11-15 15:39:31 +01:00
Marko Mäkelä
52246dff2c Merge 10.4 into 10.5 2019-11-08 09:43:41 +02:00
Marko Mäkelä
78d0d2cdc5 Cleanup: Remove mach_read_ulint()
The function mach_read_ulint() is a wrapper for the lower-level
functions mach_read_from_1(), mach_read_from_2(), mach_read_from_8().
Invoke those functions directly, for better readability of the code.

mtr_t::read_ulint(), mtr_read_ulint(): Remove. Yes, we will lose the
ability to assert that the read is covered by the mini-transaction.
We would still check that on writes, and any writes that
wrongly bypass mini-transaction logging would likely be caught by
stress testing with Mariabackup.
2019-11-08 09:41:06 +02:00
Marko Mäkelä
77e8a311e1 Merge 10.4 into 10.5
A conflict between MDEV-19514 (b42294bc64)
and MDEV-20934 (d7a2401750)
was resolved. We will not invoke the function ibuf_delete_recs()
from ibuf_merge_or_delete_for_page(). Instead, we will add that
logic to the function ibuf_read_merge_pages().
2019-11-07 10:34:33 +02:00
Marko Mäkelä
928abd6967 Merge 10.3 into 10.4 2019-11-06 13:44:56 +02:00
Marko Mäkelä
908ca4668d Merge 10.2 into 10.3 2019-11-06 13:14:31 +02:00
Marko Mäkelä
8688ef22c2 Merge 10.1 to 10.2 2019-11-06 10:18:51 +02:00
Marko Mäkelä
5164f8c206 Fix GCC 9.2.1 -Wstringop-truncation
dict_table_rename_in_cache(): Use strcpy() instead of strncpy(),
because they are known to be equivalent in this case (the length
of old_name was already validated).

mariabackup: Invoke strncpy() with one less than the buffer size,
and explicitly add NUL as the last byte of the buffer.
2019-11-04 15:52:54 +02:00
Marko Mäkelä
88cdfc5c7d MDEV-20487 Set innodb_adaptive_hash_index=OFF by default
Based on the performance testing that was conducted in MDEV-17492,
the InnoDB adaptive hash index could only help performance in specific,
almost-read-only workloads. It could slow down all kinds of workloads
(especially DROP TABLE, TRUNCATE TABLE, ALTER TABLE, or DROP INDEX
operations), and it can become corrupted, causing crashes (such as
MDEV-18815, MDEV-20203) and possibly data corruption. Furthermore,
the adaptive hash index consumes space from the InnoDB buffer pool,
which could hurt performance when the working set would almost fit
in the buffer pool.

Given all this, it is best to disable the adaptive hash index by default.
2019-10-23 17:43:31 +03:00
Sergei Golubchik
f217612fad MDEV-12684 Show what config file a sysvar got a value from
change get_one_option() prototype to pass the filename and
not to pass the redundant optid.
2019-10-14 10:29:30 +02:00
Marko Mäkelä
d04f2de80a Merge 10.4 into 10.5 2019-10-11 08:41:36 +03:00
Marko Mäkelä
c11e5cdd12 Merge 10.3 into 10.4 2019-10-10 11:19:25 +03:00
Marko Mäkelä
892378fb9d Merge 10.2 into 10.3 2019-10-09 13:25:11 +03:00
Thirunarayanan Balathandayuthapani
c65cb244b3 MDEV-19335 Remove buf_page_t::encrypted
The field buf_page_t::encrypted was added in MDEV-8588.
It was made mostly redundant in MDEV-12699. Remove the field.
2019-10-09 13:13:12 +03:00
Vlad Lesin
edda2fd149 MDEV-20703: mariabackup creates binlog files in server binlog directory on --prepare --export step
When "--export" mariabackup option is used, mariabackup starts the server in
bootstrap mode to generate *.cfg files for the certain innodb tables.
The started instance of the server reads options from the file, pointed
out in "--defaults-file" mariabackup option.

If the server uses the same config file as mariabackup, and binlog is
switched on in that config file, then "mariabackup --prepare --export"
will create binary log files in the server's binary log directory, what
can cause issues.

The fix is to add "--skip-log-bin" in mysld options when the server is
started to generate *.cfg files.
2019-10-01 13:57:24 +03:00
Marko Mäkelä
1333da90b5 Merge 10.4 into 10.5 2019-09-24 10:07:56 +03:00
Marko Mäkelä
5a92ccbaea Merge 10.3 into 10.4
Disable MDEV-20576 assertions until MDEV-20595 has been fixed.
2019-09-23 17:35:29 +03:00
Marko Mäkelä
c016ea660e Merge 10.2 into 10.3 2019-09-23 10:25:34 +03:00
Simon Lipp
c0db3fe6da MDEV-18438 Don't stream xtrabackup_info of extra-lsndir 2019-09-19 17:47:54 +03:00
Marko Mäkelä
67ddb6507d Merge 10.4 into 10.5 2019-08-16 14:35:32 +03:00
Marko Mäkelä
1d15a28e52 Merge 10.3 into 10.4 2019-08-14 18:06:51 +03:00
Marko Mäkelä
65d48b4a7b Merge 10.2 to 10.3 2019-08-13 19:28:51 +03:00
Marko Mäkelä
624dd71b94 Merge 10.4 into 10.5 2019-08-13 18:57:00 +03:00
Vlad Lesin
d39d5dd2bc MDEV-20060: Failing assertion: srv_log_file_size <= 512ULL << 30 while preparing backup
The general reason why innodb redo log file is limited by 512G is that
log_block_convert_lsn_to_no() returns value limited by 1G. But there is no
need to have unique log block numbers in log group. The fix removes 512G
limit and limits log group size by
(uint32_t maximum value) * (minimum page size), which, in turns, can be
removed if fil_io() is no longer used for innodb redo log io.
2019-08-07 17:26:44 +03:00
Vladislav Vaintroub
bd917e0811 Fix clang-cl warnings 2019-07-04 10:27:10 +02:00
Marko Mäkelä
7a3d34d645 Merge 10.3 into 10.4 2019-07-02 21:44:58 +03:00
Marko Mäkelä
e82fe21e3a Merge 10.2 into 10.3 2019-07-02 17:46:22 +03:00
Vladislav Vaintroub
6dc71d4f10 improve build, allow sql library to be built in parallel with builtins 2019-06-30 17:48:19 +02:00
Alexander Barkov
3e7e87ddcc MDEV-19897 Rename source code variable names from utf8 to utf8mb3 2019-06-28 12:37:04 +04:00
Vlad Lesin
6d2b236568 MDEV-19865: AddressSanitizer error in open_or_create_log_file() in mariabackup
Decrease array on stack in open_or_create_log_file() to avoid stack overflow.
2019-06-26 11:36:45 +03:00
Eugene Kosov
d36c107a6b imporve clang build
cmake -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_BUILD_TYPE=Debug

Maintainer mode makes all warnings errors. This patch fix warnings. Mostly about
deprecated `register` keyword.

Too much warnings came from Mroonga and I gave up on it.
2019-06-25 13:21:36 +03:00
Marko Mäkelä
3c88ce4cd1 Merge 10.4 into 10.5 2019-06-18 11:30:06 +03:00
Georg Richter
d13080133f MDEV-14101 Provide an option to select TLS protocol version
Server and command line tools now support option --tls_version to specify the
TLS version between client and server. Valid values are TLSv1.0, TLSv1.1, TLSv1.2, TLSv1.3
or a combination of them. E.g.

--tls_version=TLSv1.3
--tls_version=TLSv1.2,TLSv1.3

In case there is a gap between versions, the lowest version will be used:
--tls_version=TLSv1.1,TLSv1.3 -> Only TLSv1.1 will be available.

If the used TLS library doesn't support the specified TLS version, it will use
the default configuration.

Limitations:

SSLv3 is not supported. The default configuration doesn't support TLSv1.0 anymore.
TLSv1.3 protocol currently is only supported by OpenSSL 1.1.0 (client and server) and
GnuTLS 3.6.5 (client only).

Overview of TLS implementations and protocols

Server:

+-----------+-----------------------------------------+
| Library   | Supported TLS versions                  |
+-----------+-----------------------------------------+
| WolfSSL   | TLSv1.1, TLSv1,2                        |
+-----------+-----------------------------------------+
| OpenSSL   | (TLSv1.0), TLSv1.1, TLSv1,2, TLSv1.3    |
+-----------+-----------------------------------------+
| LibreSSL  | (TLSv1.0), TLSv1.1, TLSv1,2, TLSv1.3    |
+-----------+-----------------------------------------+

Client (MariaDB Connector/C)
+-----------+-----------------------------------------+
| Library   | Supported TLS versions                  |
+-----------+-----------------------------------------+
| GnuTLS    | (TLSv1.0), TLSv1.1, TLSv1.2, TLSv1.3    |
+-----------+-----------------------------------------+
| Schannel  | (TLSv1.0), TLSv1.1, TLSv1.2             |
+-----------+-----------------------------------------+
| OpenSSL   | (TLSv1.0), TLSv1.1, TLSv1,2, TLSv1.3    |
+-----------+-----------------------------------------+
| LibreSSL  | (TLSv1.0), TLSv1.1, TLSv1,2, TLSv1.3    |
+-----------+-----------------------------------------+
2019-06-17 12:26:25 +02:00
Marko Mäkelä
984d7100cd Merge 10.4 into 10.5 2019-06-13 18:36:09 +03:00
Marko Mäkelä
2fd82471ab Merge 10.3 into 10.4 2019-06-12 08:37:27 +03:00
Marko Mäkelä
b42dbdbccd Merge 10.2 into 10.3 2019-06-11 13:00:18 +03:00
Monty
05f8120d0e Fixed compiler warning
Wrong compiler warning from GCC:
‘snprintf’ output 2 or more bytes (assuming 4001)
into a destination of size 4000
2019-06-03 15:06:51 +03:00
Marko Mäkelä
28fad39de7 Merge 10.4 into 10.5 2019-05-29 22:29:05 +03:00
Marko Mäkelä
f98bb23168 Merge 10.3 into 10.4 2019-05-29 22:17:00 +03:00
Marko Mäkelä
90a9193685 Merge 10.2 into 10.3 2019-05-29 11:32:46 +03:00
Marko Mäkelä
c0cd662b98 Merge 10.4 to 10.5 2019-05-27 11:08:51 +03:00
Marko Mäkelä
5d2619b693 MDEV-19584 Allocate recv_sys statically
There is only one InnoDB crash recovery subsystem.
Allocating recv_sys statically removes one level of pointer indirection
and makes code more readable, and removes the awkward initialization of
recv_sys->dblwr.

recv_sys_t::create(): Replaces recv_sys_init().

recv_sys_t::debug_free(): Replaces recv_sys_debug_free().

recv_sys_t::close(): Replaces recv_sys_close().

recv_sys_t::add(): Replaces recv_add_to_hash_table().

recv_sys_t::empty(): Replaces recv_sys_empty_hash().
2019-05-24 16:19:38 +03:00
Vlad Lesin
bff9b8026b MDEV-14192: mariabackup.incremental_backup failed in buildbot with Failing assertion: byte_offset % OS_FILE_LOG_BLOCK_SIZE == 0
In some cases it's possible that InnoDB redo log file header is re-written so,
that checkpoint lsn and checkpoint lsn offset are updated, but checkpoint
number stays the same. The fix is to re-read redo log header if at least one
of those three parametes is changed at backup start.

Repeat the logic of log_group_checkpoint() on choosing InnoDB checkpoint info
field on backup start. This does not influence backup correctness, but
simplifies bugs analysis.
2019-05-24 14:20:03 +03:00
Marko Mäkelä
893472d005 MDEV-19570 Deprecate and ignore innodb_undo_logs, remove innodb_rollback_segments
The option innodb_rollback_segments was deprecated already in
MariaDB Server 10.0. Its misleadingly named replacement innodb_undo_logs
is of very limited use. It makes sense to always create and use the
maximum number of rollback segments.

Let us remove the deprecated parameter innodb_rollback_segments and
deprecate&ignore the parameter innodb_undo_logs (to be removed in a
later major release).

This work involves some cleanup of InnoDB startup. Similar to other
write operations, DROP TABLE will no longer be allowed if
innodb_force_recovery is set to a value larger than 3.
2019-05-23 17:34:47 +03:00
Marko Mäkelä
826f9d4f7e Merge 10.4 into 10.5 2019-05-23 10:32:21 +03:00
Marko Mäkelä
1a6f470464 MDEV-19544 Remove innodb_locks_unsafe_for_binlog
The transaction isolation levels READ COMMITTED and READ UNCOMMITTED
should behave similarly to the old deprecated setting
innodb_locks_unsafe_for_binlog=1, that is, avoid acquiring gap locks.

row_search_mvcc(): Reduce the scope of some variables, and clean up
the initialization and use of the variable set_also_gap_locks.
2019-05-23 10:25:12 +03:00
Marko Mäkelä
47cede646b MDEV-19543 Deprecate and ignore innodb_log_checksums
The parameter innodb_log_checksums that was introduced in MariaDB 10.2.2
via mysql/mysql-server@af0acedd88
does not make much sense. The original motivation of introducing this
parameter (initially called innodb_log_checksum_algorithm in
mysql/mysql-server@22ba38218e)
was that the InnoDB redo log used the slow and insecure innodb algorithm.
With hardware or SIMD accelerated CRC-32C, there should be no reason to
allow checksums to be disabled on the redo log.

The parameter innodb_encrypt_log already implies innodb_log_checksums=ON.

Let us deprecate the parameter innodb_log_checksums and always compute
redo log checksums, even if innodb_log_checksums=OFF is specified.

An upgrade from MariaDB 10.2.2 or later will only be possible after
using the default value innodb_log_checksums=ON. If the non-default
value innodb_log_checksums=OFF was in effect when the server was shut down,
a log block checksum mismatch will be reported and the upgraded server
will fail to start up.
2019-05-23 10:25:11 +03:00
Vladislav Vaintroub
5e4b657dd4 MDEV-18531 : Use WolfSSL instead of YaSSL as "bundled" SSL/encryption library
- Add new submodule for WolfSSL
- Build and use wolfssl and wolfcrypt instead of yassl/taocrypt
- Use HAVE_WOLFSSL instead of HAVE_YASSL
- Increase MY_AES_CTX_SIZE, to avoid compile time asserts in my_crypt.cc
(sizeof(EVP_CIPHER_CTX) is larger on WolfSSL)
2019-05-22 13:48:25 +02:00
Oleksandr Byelkin
c07325f932 Merge branch '10.3' into 10.4 2019-05-19 20:55:37 +02:00
Marko Mäkelä
be85d3e61b Merge 10.2 into 10.3 2019-05-14 17:18:46 +03:00
Marko Mäkelä
26a14ee130 Merge 10.1 into 10.2 2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru
cb248f8806 Merge branch '5.5' into 10.1 2019-05-11 22:19:05 +03:00
Vicențiu Ciorbaru
c0ac0b8860 Update FSF address 2019-05-11 19:25:02 +03:00
Marko Mäkelä
d3dcec5d65 Merge 10.3 into 10.4 2019-05-05 15:06:44 +03:00
Marko Mäkelä
b132b8895e Merge 10.3 into 10.4 2019-05-05 10:23:14 +03:00
Marko Mäkelä
b6f4cccd19 Merge 10.2 into 10.3 2019-05-03 20:14:09 +03:00
Vladislav Vaintroub
13d7c721a5 MDEV-17008 prepare with datadir, on Windows, does not set ACL
on tablespace files

Fix is to always add Full Control for NetworkService account, for every
file that copyback/moveback copies around.
2019-05-02 19:44:36 +01:00
Vladislav Vaintroub
4b0f010b88 MDEV-18544 "missing required privilege PROCESS on *.*" using mariabackup for SST
If required privilege is missing, dump the output from "SHOW GRANTS"
into mariabackup log.

This will help troubleshooting, and make the bug reproducible.
2019-05-02 14:25:24 +01:00
Marko Mäkelä
447b8ba164 Merge 10.2 into 10.3 2019-04-29 17:54:10 +03:00
Vladislav Vaintroub
e03ad4f71a Fix a typo 2019-04-29 12:40:51 +01:00
Marko Mäkelä
4d59f45260 Merge 10.2 into 10.3 2019-04-27 20:41:31 +03:00
Eugene Kosov
6c5c1f0b2f MDEV-19231 make DB_SUCCESS equal to 0
It's a micro optimization. On most platforms CPUs has instructions to
compare with 0 fast. DB_SUCCESS is the most popular outcome of functions
and this patch optimized code like (err == DB_SUCCESS)

BtrBulk::finish(): bogus assertion fixed

fil_node_t::read_page0(): corrected usage of os_file_read()

que_eval_sql(): bugus assertion removed. Apparently it checked that
the field was assigned after having been zero-initialized at
object creation.

It turns out that the return type of os_file_read_func() was changed
in mysql/mysql-server@98909cefbc (MySQL 5.7)
from ibool to dberr_t. The reviewer (if there was any) failed to
point out that because of future merges, it could be a bad idea to
change the return type of a function without changing the function name.

This change was applied to MariaDB 10.2.2 in
commit 2e814d4702 but the
MariaDB-specific code was not fully adjusted accordingly,
e.g. in fil_node_open_file(). Essentially, code like
!os_file_read(...) became dead code in MariaDB and later
in Mariabackup 10.2, and we could be dealing with an uninitialized
buffer after a failed page read.
2019-04-25 16:29:55 +03:00
Marko Mäkelä
e6bdf77e4b Merge 10.3 into 10.4
In is_eits_usable(), we disable an assertion that fails due to
MDEV-19334.
2019-04-25 16:05:20 +03:00
Marko Mäkelä
acf6f92aa9 Merge 10.2 into 10.3 2019-04-25 09:05:52 +03:00
Vladislav Vaintroub
6b5d3c51b3 Do fast exit with error code and FATAL ERROR message, if innodb cannot start during prepare.
Otherwise, it is possible for "prepare" to exit with error code 0,
even if it did not work at all.
2019-04-23 12:44:09 +01:00
Monty
a024649081 Fixed compiler warnings form gcc 7.3.1 2019-04-19 13:17:14 +03:00
Marko Mäkelä
edd1a53a55 Merge 10.3 into 10.4 2019-04-08 22:00:07 +03:00
Vlad Lesin
05b84b2568 MDEV-14192: Add debug assertions 2019-04-08 21:38:32 +03:00
Marko Mäkelä
9ba0865b87 Merge 10.2 into 10.3 2019-04-08 21:38:13 +03:00
Marko Mäkelä
f120a15b93 MDEV-19212 4GB Limit on large_pages - integer overflow
os_mem_alloc_large(): Invoke the macro ut_2pow_round() with the
correct argument type.

innobase_large_page_size, innobase_use_large_pages,
os_use_large_pages, os_large_page_size: Remove.
Simply refer to opt_large_page_size, my_use_large_pages.
2019-04-08 21:33:49 +03:00
Vlad Lesin
caa8c20abe MDEV-14192 Mariabackup assertion failure: byte_offset % OS_FILE_LOG_BLOCK_SIZE == 0
xtrabackup_backup_func(): If the log checkpoint header changed
since we last read it, search for the most recent checkpoint again.
Otherwise, we could corrupt the backup of the redo log, because the
least significant bits of checkpoint_lsn_start would not match
log_sys->log.lsn.
2019-04-08 21:33:49 +03:00
Marko Mäkelä
d8303c3ee7 Merge 10.3 into 10.4 2019-04-08 08:22:34 +03:00
Marko Mäkelä
cc492bfd4f Merge 10.2 into 10.3 2019-04-07 11:49:50 +03:00
Marko Mäkelä
1d30b7b1d2 MDEV-12699 preparation: Clean up recv_sys
The recv_sys data structures are accessed not only from the thread
that executes InnoDB plugin initialization, but also from the
InnoDB I/O threads, which can invoke recv_recover_page().

Assert that sufficient concurrency control is in place.
Some code was accessing recv_sys data structures without
holding recv_sys->mutex.

recv_recover_page(bpage): Refactor the call from buf_page_io_complete()
into a separate function that performs necessary steps. The
main thread was unnecessarily releasing and reacquiring recv_sys->mutex.

recv_recover_page(block,mtr,recv_addr): Pass more parameters from
the caller. Avoid redundant lookups and computations. Eliminate some
redundant variables.

recv_get_fil_addr_struct(): Assert that recv_sys->mutex is being held.
That was not always the case!

recv_scan_log_recs(): Acquire recv_sys->mutex for the whole duration
of the function. (While we are scanning and buffering redo log records,
no pages can be read in.)

recv_read_in_area(): Properly protect access with recv_sys->mutex.

recv_apply_hashed_log_recs(): Check recv_addr->state only once,
and continuously hold recv_sys->mutex. The mutex will be released
and reacquired inside recv_recover_page() and recv_read_in_area(),
allowing concurrent processing by buf_page_io_complete() in I/O threads.
2019-04-06 21:25:43 +03:00
Marko Mäkelä
0bc4260226 Merge 10.3 into 10.4 2019-03-26 17:43:59 +02:00
Marko Mäkelä
ffc69dbd05 Merge 10.2 into 10.3 2019-03-26 15:03:37 +02:00
Marko Mäkelä
226ca250ed Merge 10.1 into 10.2 2019-03-26 14:17:19 +02:00
FaramosCZ
ed643f4bb3 Fix resource leak
The 'bitmap_dir' has to be closed when no longer needed
2019-03-22 16:19:30 +04:00
FaramosCZ
9c9bf9642e Fix resource leak 2019-03-22 15:48:26 +04:00
FaramosCZ
89e390b7ce Fix resource leak 2019-03-22 15:20:56 +04:00
Marko Mäkelä
ca80e14a88 Merge 10.3 into 10.4 2019-03-22 13:20:44 +02:00
FaramosCZ
9ff713d33a Fix resource leak 2019-03-22 14:52:30 +04:00
Alexander Barkov
1c60f40868 Merge remote-tracking branch 'origin/10.2' into 10.3 2019-03-22 14:41:36 +04:00
Marko Mäkelä
031fa8f1d2 Merge 10.1 into 10.2 2019-03-22 11:15:21 +02:00
Marko Mäkelä
7af0de9fa3 Merge 10.3 into 10.4 2019-03-21 11:22:13 +02:00
Marko Mäkelä
5a780f2ec8 Merge 10.2 into 10.3 2019-03-21 10:14:23 +02:00
Marko Mäkelä
5c1720f8af Remove unused variables from non-debug build 2019-03-21 09:30:37 +02:00
Marko Mäkelä
3d1b6f68f1 Mariabackup: Ensure NUL termination in strncpy() 2019-03-21 08:54:35 +02:00
Marko Mäkelä
514b305dfb Merge 10.3 into 10.4
The MDEV-17262 commit 26432e49d3
was skipped. In Galera 4, the implementation would seem to require
changes to the streaming replication.

In the tests archive.rnd_pos main.profiling, disable_ps_protocol
for SHOW STATUS and SHOW PROFILE commands until MDEV-18974
has been fixed.
2019-03-20 10:41:32 +02:00
Marko Mäkelä
6b6fa3cdb1 MDEV-18644: Support full_crc32 for page_compressed
This is a follow-up task to MDEV-12026, which introduced
innodb_checksum_algorithm=full_crc32 and a simpler page format.
MDEV-12026 did not enable full_crc32 for page_compressed tables,
which we will be doing now.

This is joint work with Thirunarayanan Balathandayuthapani.

For innodb_checksum_algorithm=full_crc32 we change the
page_compressed format as follows:

FIL_PAGE_TYPE: The most significant bit will be set to indicate
page_compressed format. The least significant bits will contain
the compressed page size, rounded up to a multiple of 256 bytes.

The checksum will be stored in the last 4 bytes of the page
(whether it is the full page or a page_compressed page whose
size is determined by FIL_PAGE_TYPE), covering all preceding
bytes of the page. If encryption is used, then the page will
be encrypted between compression and computing the checksum.
For page_compressed, FIL_PAGE_LSN will not be repeated at
the end of the page.

FSP_SPACE_FLAGS (already implemented as part of MDEV-12026):
We will store the innodb_compression_algorithm that may be used
to compress pages. Previously, the choice of algorithm was written
to each compressed data page separately, and one would be unable
to know in advance which compression algorithm(s) are used.

fil_space_t::full_crc32_page_compressed_len(): Determine if the
page_compressed algorithm of the tablespace needs to know the
exact length of the compressed data. If yes, we will reserve and
write an extra byte for this right before the checksum.

buf_page_is_compressed(): Determine if a page uses page_compressed
(in any innodb_checksum_algorithm).

fil_page_decompress(): Pass also fil_space_t::flags so that the
format can be determined.

buf_page_is_zeroes(): Check if a page is full of zero bytes.

buf_page_full_crc32_is_corrupted(): Renamed from
buf_encrypted_full_crc32_page_is_corrupted(). For full_crc32,
we always simply validate the checksum to the page contents,
while the physical page size is explicitly specified by an
unencrypted part of the page header.

buf_page_full_crc32_size(): Determine the size of a full_crc32 page.

buf_dblwr_check_page_lsn(): Make this a debug-only function, because
it involves potentially costly lookups of fil_space_t.

create_table_info_t::check_table_options(),
ha_innobase::check_if_supported_inplace_alter(): Do allow the creation
of SPATIAL INDEX with full_crc32 also when page_compressed is used.

commit_cache_norebuild(): Preserve the compression algorithm when
updating the page_compression_level.

dict_tf_to_fsp_flags(): Set the flags for page compression algorithm.
FIXME: Maybe there should be a table option page_compression_algorithm
and a session variable to back it?
2019-03-18 14:08:43 +02:00
Sergei Golubchik
b64fde8f38 Merge branch '10.2' into 10.3 2019-03-17 13:06:41 +01:00
Sergei Golubchik
f1134d5676 post-merge: gcc 8 warnings
note: Inherit String from Sql_alloc,
to get operators new and new[] in sync

in rocksdb gcc was complaining that non-lvalue was cast to const.
2019-03-15 21:00:50 +01:00
Sergei Golubchik
0508d327ae Merge branch '10.1' into 10.2 2019-03-15 21:00:41 +01:00
Vladislav Vaintroub
cd805a5f1c Cleanup - remove unused variables and functions after MDEV-18917 2019-03-15 11:42:15 +01:00
Vladislav Vaintroub
396cf60ac0 MDEV-18917 Don't create xtrabackup_binlog_pos_innodb with Mariabackup 2019-03-15 11:02:00 +01:00
Marko Mäkelä
5a796f1f41 Merge 10.3 into 10.4 2019-03-08 22:11:37 +02:00
Marko Mäkelä
89b463ee99 Merge 10.2 into 10.3 2019-03-08 22:00:24 +02:00
Marko Mäkelä
ab7e2b048d Merge 10.1 into 10.2 2019-03-08 20:45:45 +02:00
Thirunarayanan Balathandayuthapani
d038806dfe MDEV-18855 Mariabackup should fetch innodb_compression_level from running server
- Fetch innodb_compression_level from the running server.Add the value
of innodb_compression_level in backup-my.cnf file during backup phase.
So that prepare can use the innodb_compression_level variable from
backup-my.cnf
2019-03-08 16:00:08 +05:30
Marko Mäkelä
aa4b2c1509 Merge 10.3 into 10.4 2019-03-07 08:02:33 +02:00
Marko Mäkelä
77103e9832 Merge 10.2 into 10.3 2019-03-06 16:20:13 +02:00
Marko Mäkelä
c155946c90 Merge 10.1 into 10.2 2019-03-06 15:15:59 +02:00
Marko Mäkelä
b761211685 MDEV-18659: Fix string truncation/overflow in InnoDB and XtraDB
Fix the warnings issued by GCC 8 -Wstringop-truncation
and -Wstringop-overflow in InnoDB and XtraDB.

This work is motivated by Jan Lindström. The patch mainly differs
from his original one as follows:

(1) We remove explicit initialization of stack-allocated string buffers.
The minimum amount of initialization that is needed is a terminating
NUL character.
(2) GCC issues a warning for invoking strncpy(dest, src, sizeof dest)
because if strlen(src) >= sizeof dest, there would be no terminating
NUL byte in dest. We avoid this problem by invoking strncpy() with
a limit that is 1 less than the buffer size, and by always writing
NUL to the last byte of the buffer.
(3) We replace strncpy() with memcpy() or strcpy() in those cases
when the result is functionally equivalent.

Note: fts_fetch_index_words() never deals with len==UNIV_SQL_NULL.
This was enforced by an assertion that limits the maximum length
to FTS_MAX_WORD_LEN. Also, the encoding that InnoDB uses for
the compressed fulltext index is not byte-order agnostic, that is,
InnoDB data files that use FULLTEXT INDEX are not portable between
big-endian and little-endian systems.
2019-03-06 11:22:27 +02:00
Marko Mäkelä
2a791c53ad Merge 10.3 into 10.4 2019-03-06 09:00:52 +02:00
Marko Mäkelä
a2fc36989e Merge 10.2 into 10.3 2019-03-04 17:01:00 +02:00
Oleksandr Byelkin
93ac7ae70f Merge branch '10.3' into 10.4 2019-02-21 14:40:52 +01:00
Vladislav Vaintroub
945c748adc MDEV-18669 mariabackup writes timestamp in version line
Use fprintf(stderr) instead of msg() to print the version line
2019-02-21 00:06:08 +01:00
Vladislav Vaintroub
39a2984dc0 Revert "MDEV-18575 Cleanup : remove innodb-encrypt-log parameter from mariabackup"
This reverts commit 3262967008.
It was checked in by mistake
2019-02-20 17:22:44 +01:00
Thirunarayanan Balathandayuthapani
c0f47a4a58 MDEV-12026: Implement innodb_checksum_algorithm=full_crc32
MariaDB data-at-rest encryption (innodb_encrypt_tables)
had repurposed the same unused data field that was repurposed
in MySQL 5.7 (and MariaDB 10.2) for the Split Sequence Number (SSN)
field of SPATIAL INDEX. Because of this, MariaDB was unable to
support encryption on SPATIAL INDEX pages.

Furthermore, InnoDB page checksums skipped some bytes, and there
are multiple variations and checksum algorithms. By default,
InnoDB accepts all variations of all algorithms that ever existed.
This unnecessarily weakens the page checksums.

We hereby introduce two more innodb_checksum_algorithm variants
(full_crc32, strict_full_crc32) that are special in a way:
When either setting is active, newly created data files will
carry a flag (fil_space_t::full_crc32()) that indicates that
all pages of the file will use a full CRC-32C checksum over the
entire page contents (excluding the bytes where the checksum
is stored, at the very end of the page). Such files will always
use that checksum, no matter what the parameter
innodb_checksum_algorithm is assigned to.

For old files, the old checksum algorithms will continue to be
used. The value strict_full_crc32 will be equivalent to strict_crc32
and the value full_crc32 will be equivalent to crc32.

ROW_FORMAT=COMPRESSED tables will only use the old format.
These tables do not support new features, such as larger
innodb_page_size or instant ADD/DROP COLUMN. They may be
deprecated in the future. We do not want an unnecessary
file format change for them.

The new full_crc32() format also cleans up the MariaDB tablespace
flags. We will reserve flags to store the page_compressed
compression algorithm, and to store the compressed payload length,
so that checksum can be computed over the compressed (and
possibly encrypted) stream and can be validated without
decrypting or decompressing the page.

In the full_crc32 format, there no longer are separate before-encryption
and after-encryption checksums for pages. The single checksum is
computed on the page contents that is written to the file.

We do not make the new algorithm the default for two reasons.
First, MariaDB 10.4.2 was a beta release, and the default values
of parameters should not change after beta. Second, we did not
yet implement the full_crc32 format for page_compressed pages.
This will be fixed in MDEV-18644.

This is joint work with Marko Mäkelä.
2019-02-19 18:50:19 +02:00
Marko Mäkelä
fc124778ea Merge 10.2 into 10.3 2019-02-19 17:41:13 +02:00
Marko Mäkelä
ca76fc4a3a MDEV-18611: mariabackup silently ended during xtrabackup_copy_logfile()
log_group_read_log_seg(): Always return false when returning
before reading end_lsn.

xtrabackup_copy_logfile(): On error, indicate whether
a corrupt log record was encountered.

Only xtrabackup_copy_logfile() in Mariabackup cared about the
return value of the function. InnoDB crash recovery was not
affected by this bug.
2019-02-19 11:14:03 +02:00
Vladislav Vaintroub
d2fc9d09da MDEV-18204 - fixup 2019-02-19 07:31:25 +01:00
Vladislav Vaintroub
3a42926c88 MDEV-18204 Fix rocksdb incremental backup
Fix incremental prepare to copy #rocksdb subdirectory from the
incremental dir.
2019-02-18 18:59:05 +01:00
Vladislav Vaintroub
282ba973e7 MDEV-18549 Failing assertion: opt_no_lock during mariabackup --backup
The assertion happens under BACKUP STAGE BLOCK_COMMIT, when a DDL on a
temporary table (#sql-xxx) is found.

Apparently, assumption that all DDLs are blocked under FTWRL does not
hold for BACKUP STAGE, and temporary tables can still have ALTERs

The fix is to relax the assertion, and only check for opt_no_lock if
backup is *really* inconsistent, i.e either optimized DDL or CREATE/RENAME
are done on the tables that were not skipped during backup.
2019-02-14 20:18:34 +01:00
Vladislav Vaintroub
40b4f9c907 MDEV-18575 remove innodb-encrypt-log parameter from mariabackup
The variable is obsolete.
In mariabackup --backup, encryption plugin loading code sets the value
this parameter to the same as in server.

In mariabackup --prepare, no new redo  log is generated,
and xtrabackup_logfile is removed after it anyway.
2019-02-14 11:54:34 +01:00
Vladislav Vaintroub
3262967008 MDEV-18575 Cleanup : remove innodb-encrypt-log parameter from mariabackup 2019-02-14 11:11:16 +01:00
Marko Mäkelä
0a1c3477bf MDEV-18493 Remove page_size_t
MySQL 5.7 introduced the class page_size_t and increased the size of
buffer pool page descriptors by introducing this object to them.

Maybe the intention of this exercise was to prepare for a future
where the buffer pool could accommodate multiple page sizes.
But that future never arrived, not even in MySQL 8.0. It is much
easier to manage a pool of a single page size, and typically all
storage devices of an InnoDB instance benefit from using the same
page size.

Let us remove page_size_t from MariaDB Server. This will make it
easier to remove support for ROW_FORMAT=COMPRESSED (or make it a
compile-time option) in the future, just by removing various
occurrences of zip_size.
2019-02-07 12:21:35 +02:00
Marko Mäkelä
e80bcd7f64 Merge 10.3 into 10.4 2019-02-05 12:48:02 +02:00
Marko Mäkelä
ab2458c61f Merge 10.2 into 10.3 2019-02-04 15:12:14 +02:00
Marko Mäkelä
081fd8bfa2 Merge 10.1 into 10.2 2019-02-02 11:40:02 +02:00
Vladislav Vaintroub
a2641b2611 MDEV-18380 : adjust max_statement_time in mariabackup 2019-02-01 08:53:50 +02:00
Thirunarayanan Balathandayuthapani
f669cecbe3 MDEV-18415 mariabackup.mdev-14447 test case fails with Table 'test.t' doesn't exist in engine
- Added retry logic if validation of first page fails with checksum
mismatch.
2019-02-01 08:53:50 +02:00
Vladislav Vaintroub
6e2af7d084 mariabackup : Remove unused parameter innodb_buffer_pool_size 2019-01-30 09:31:32 +01:00
Marko Mäkelä
78829a5780 Merge 10.3 into 10.4 2019-01-24 22:42:35 +02:00
Marko Mäkelä
947b6b849d Merge 10.2 into 10.3 2019-01-24 16:14:12 +02:00
Geoff Montee
f17c284c57 MDEV-18347: mariabackup doesn't read all server option groups from configuration files 2019-01-24 12:44:55 +01:00
Vladislav Vaintroub
5c159c9037 MDEV-18356 Renamed backup-encrypted option introduced in 7158edcba3
Rename it because it caused parser warning whenever --backup was used.
2019-01-23 12:09:02 +00:00
Brave Galera Crew
36a2a185fe Galera4 2019-01-23 15:30:00 +04:00
Marko Mäkelä
9dc81d2a7a Merge 10.3 into 10.4 2019-01-17 13:21:27 +02:00
Marko Mäkelä
77cbaa96ad Merge 10.2 into 10.3 2019-01-17 12:38:46 +02:00
Vladislav Vaintroub
2153aaf66e mariabackup : use die() macro for fatal exit with error message. 2019-01-16 08:39:49 +01:00
Vladislav Vaintroub
a8a27e65a8 MDEV-18212 mariabackup: Make output format uniform whenever possible 2019-01-15 14:15:04 +01:00
Marko Mäkelä
55a0c3eb6d Merge 10.3 into 10.4 2019-01-15 12:30:29 +02:00
Vladislav Vaintroub
32bb579192 replace hardcoded string "#sql" with tmp_file_prefix 2019-01-14 14:43:24 +01:00
Vladislav Vaintroub
24bc8edcc5 MDEV-18184 Do not copy temporary files (prefixed with #sql-) to backup 2019-01-14 14:25:08 +01:00
Marko Mäkelä
efb510462e Merge 10.2 into 10.3 2019-01-14 14:55:50 +02:00
Marko Mäkelä
248dc12e60 Merge 10.1 into 10.2 2019-01-14 11:37:51 +02:00
FaramosCZ
7372fe4da1 xb_process_datadir(): Fix resource leaks
Closes #983, #984
2019-01-14 11:02:04 +02:00
Vladislav Vaintroub
7331c661db MDEV-18201 : mariabackup- fix processing of rename/create sequence in prepare
Fix one more bug in "DDL redo" phase in prepare
If table was renamed, and then new table was created with the old name,
prepare can be confused, and .ibd can end up with wrong name.

Fix the order of how DDL fixup is applied , once again - ".new" files
should be processed after renames.
2019-01-10 19:35:45 +01:00
Vladislav Vaintroub
4a872ae1e7 MDEV-18185 - mariabackup - fix specific case of table rename handing in prepare.
If, during backup
1) Innodb table is dropped (after being copied to backup) and then
2) Before backup finished, another Innodb table is renamed, and new name
is the name of the dropped table in 1)

then, --prepare fails with assertion, as DDL fixup code in prepare
did not handle this specific case.

The fix is to process drops before renames, in prepare DDL-"redo" phase.
2019-01-09 22:28:31 +01:00
Marko Mäkelä
734510a44d Merge 10.3 into 10.4 2019-01-06 17:43:02 +02:00
Marko Mäkelä
dd43c85c48 Merge 10.2 into 10.3 2019-01-04 02:06:41 +02:00
Marko Mäkelä
23e4446adc Fix a merge error in the parent commit
Fix an inadvertently inverted condition that caused
galera.galera_sst_mariabackup_table_options test failure.
2019-01-04 01:59:46 +02:00
Marko Mäkelä
94e22efb79 Merge 10.2 into 10.3 2019-01-03 17:26:50 +02:00
Marko Mäkelä
b7392d142a Merge 10.1 into 10.2 2019-01-03 16:58:05 +02:00
Marko Mäkelä
7158edcba3 MDEV-18129 Backup fails for encrypted tables: mariabackup: Database page corruption detected at page 1
If an encrypted table is created during backup, then
mariabackup --backup could wrongly fail.

This caused a failure of the test mariabackup.huge_lsn once on buildbot.

This is due to the way how InnoDB creates .ibd files. It would first
write a dummy page 0 with no encryption information. Due to this,
xb_fil_cur_open() could wrongly interpret that the table is not encrypted.
Subsequently, page_is_corrupted() would compare the computed page
checksum to the wrong checksum. (There are both "before" and "after"
checksums for encrypted pages.)

To work around this problem, we introduce a Boolean option
--backup-encrypted that is enabled by default. With this option,
Mariabackup will assume that a nonzero key_version implies that the
page is encrypted. We need this option in order to be able to copy
encrypted tables from MariaDB 10.1 or 10.2, because unencrypted pages
that were originally created before MySQL 5.1.48 could contain nonzero
garbage in the fields that were repurposed for encryption.

Later, MDEV-18128 would clean up the way how .ibd files are created,
to remove the need for this option.

page_is_corrupted(): Add missing const qualifiers, and do not check
space->crypt_data unless --skip-backup-encrypted has been specified.

xb_fil_cur_read(): After a failed page read, output a page dump.
2019-01-03 16:46:38 +02:00
Sergei Golubchik
6bb11efa4a Merge branch '10.2' into 10.3 2019-01-03 13:09:41 +01:00
Marko Mäkelä
cf9070a8f7 Merge 10.1 into 10.2 2018-12-29 23:12:25 +02:00
Marko Mäkelä
50c9469be8 MDEV-18105 Mariabackup fails to copy encrypted InnoDB system tablespace if LSN>4G
This is a regression caused by
commit 8c43f96388
that was part of the MDEV-12112 fixes.

page_is_corrupted(): Never interpret page_no=0 as encrypted.
2018-12-29 22:59:20 +02:00
Marko Mäkelä
0b73b96f9d Merge 10.1 into 10.2 2018-12-29 11:05:26 +02:00
Vladislav Vaintroub
ed66acb291 Silence LeakSanitizer by default in mariabackup, so that phanthom "leaks"
would not hide more interesting information, like invalid memory accesses.


some "leaks" are expected
- partly this is due to weird options parsing, that runs twice, and
does not free memory after the first run.
- also we do not mind to exit()  whenever it makes sense, without full
cleanup.
2018-12-29 02:06:19 +01:00
Marko Mäkelä
cb2f36d3ea MDEV-14975: Cleanup check_all_privileges()
Remove an unused variable, and propagate the error to the caller
instead of calling exit().
2018-12-28 15:33:34 +02:00
Vladislav Vaintroub
773479f5b3 Add test for partial backup for partitioned table. 2018-12-21 16:04:16 +01:00
Marko Mäkelä
b7a9563b21 Merge 10.1 into 10.2 2018-12-21 09:43:35 +02:00
Vladislav Vaintroub
9f4a4cb401 Cleanup recent mariabackup validation patches.
- Refactor code to isolate page validation in page_is_corrupted() function.

- Introduce --extended-validation parameter(default OFF) for mariabackup
--backup to enable decryption of encrypted uncompressed pages during
backup.

- mariabackup would still always check checksum on encrypted data,
it is needed to detect  partially written pages.
2018-12-20 14:31:18 +01:00
Marko Mäkelä
610e4034d7 Merge 10.1 into 10.2 2018-12-19 15:55:55 +02:00
Marko Mäkelä
dd72d7d561 MDEV-18025: Improve test case and consistency checks
Write a test case that computes valid crc32 checksums for
an encrypted page, but zeroes out the payload area, so
that the checksum after decryption fails.

xb_fil_cur_read(): Validate the page number before trying
any checksum calculation or decrypting or decompression.
Also, skip zero-filled pages. For page_compressed pages,
ensure that the FIL_PAGE_TYPE was changed. Also, reject
FIL_PAGE_PAGE_COMPRESSED_ENCRYPTED if no decryption was attempted.
2018-12-19 15:45:35 +02:00
Vladislav Vaintroub
74659e55b7 MDEV-16268 : sporadic checksum mismatch when opening system tablespace in backup
Attempt to fix by retrying srv_sys_space.open_or_create() if it reports
corruption.
2018-12-18 17:33:42 +01:00
Marko Mäkelä
560df47926 Merge 10.1 into 10.2 2018-12-18 16:28:19 +02:00
Thirunarayanan Balathandayuthapani
171271edf8 MDEV-18025 Mariabackup fails to detect corrupted page_compressed=1 tables
Problem:
=======
Mariabackup seems to fail to verify the pages of compressed tables.
The reason is that both fil_space_verify_crypt_checksum() and
buf_page_is_corrupted() will skip the validation for compressed pages.

Fix:
====
Mariabackup should call fil_page_decompress() for compressed and encrypted
compressed page. After that, call buf_page_is_corrupted() to
check the page corruption.
2018-12-18 18:07:17 +05:30
Marko Mäkelä
b5763ecd01 Merge 10.3 into 10.4 2018-12-18 11:33:53 +02:00
Marko Mäkelä
45531949ae Merge 10.2 into 10.3 2018-12-18 09:15:41 +02:00
Marko Mäkelä
7d245083a4 Merge 10.1 into 10.2 2018-12-17 20:15:38 +02:00
Marko Mäkelä
8c43f96388 Follow-up to MDEV-12112: corruption in encrypted table may be overlooked
The initial fix only covered a part of Mariabackup.
This fix hardens InnoDB and XtraDB in a similar way, in order
to reduce the probability of mistaking a corrupted encrypted page
for a valid unencrypted one.

This is based on work by Thirunarayanan Balathandayuthapani.

fil_space_verify_crypt_checksum(): Assert that key_version!=0.
Let the callers guarantee that. Now that we have this assertion,
we also know that buf_page_is_zeroes() cannot hold.
Also, remove all diagnostic output and related parameters,
and let the relevant callers emit such messages.
Last but not least, validate the post-encryption checksum
according to the innodb_checksum_algorithm (only accepting
one checksum for the strict variants), and no longer
try to validate the page as if it was unencrypted.

buf_page_is_zeroes(): Move to the compilation unit of the only callers,
and declare static.

xb_fil_cur_read(), buf_page_check_corrupt(): Add a condition before
calling fil_space_verify_crypt_checksum(). This is a non-functional
change.

buf_dblwr_process(): Validate the page only as encrypted or unencrypted,
but not both.
2018-12-17 19:33:44 +02:00
Vladislav Vaintroub
0a2edddbf4 MDEV-14975 : fix last commit's typo. 2018-12-15 00:06:00 +01:00
Vladislav Vaintroub
5716c71c54 MDEV-14975 mariabackup starts with unprivileged user.
ported privilege checking from xtrabackup.
Now, mariabackup would terminate early if either RELOAD or PROCESS privilege
is not held, not at the very end of backup

The behavior can be disabled with nre setting --check-privileges=0.
Also , --no-lock does not need all of these privileges, since it skips
FTWRL and SHOW ENGINE STATUS INNODB.
2018-12-14 23:36:21 +01:00
Marko Mäkelä
fd37344feb Merge 10.3 into 10.4 2018-12-14 16:20:17 +02:00
Marko Mäkelä
5fefcb0a21 Merge 10.2 into 10.3 2018-12-14 16:15:59 +02:00
Marko Mäkelä
94fa02f4d0 Merge 10.1 into 10.2 2018-12-14 16:11:05 +02:00
Marko Mäkelä
fb252f70c1 MDEV-12112 corruption in encrypted table may be overlooked
After validating the post-encryption checksum on an encrypted page,
Mariabackup should decrypt the page and validate the pre-encryption
checksum as well. This should reduce the probability of accepting
invalid pages as valid ones.

This is a backport and refactoring of a patch that was
originally written by Thirunarayanan Balathandayuthapani
for the 10.2 branch.
2018-12-14 15:44:51 +02:00
Vladislav Vaintroub
3efed7533c Fix lock_ddl_per_table handling in mariabackup.
mariabackup.lock_ddl_per_table was broken, after MDEV-17772 got fixed.
Fix removes MDL waiter killer, it is is not needed anymore.
2018-12-10 14:03:52 +01:00
Vladislav Vaintroub
c0ca164b1c MDEV-17308 mariabackup to use the BACKUP statement instead of FTWRL 2018-12-09 22:12:27 +02:00
Marko Mäkelä
b374246730 Merge 10.3 into 10.4 2018-11-30 11:05:46 +02:00
Marko Mäkelä
0abd2766b1 Merge 10.2 into 10.3
Also, related to MDEV-15522, MDEV-17304, MDEV-17835,
remove the Galera xtrabackup tests, because xtrabackup never worked
with MariaDB Server 10.3 due to InnoDB redo log format changes.
2018-11-30 09:38:56 +02:00
Marko Mäkelä
447e493179 Remove some unnecessary InnoDB #include 2018-11-29 12:53:44 +02:00
Marko Mäkelä
926b04e550 Merge 10.3 into 10.4 2018-11-28 01:18:30 +02:00
Marko Mäkelä
babb000a36 Merge 10.2 into 10.3 2018-11-28 01:02:46 +02:00
Marko Mäkelä
e82e216e37 MDEV-17849 Undo tablespace truncation recovery fails to shrink file
fil_space_t::add(): Replaces fil_node_create(), fil_node_create_low().
Let the caller pass fil_node_t::handle, to avoid having to close and
re-open files.

fil_node_t::read_page0(): Refactored from fil_node_open_file().
Read the first page of a data file.

fil_node_open_file(): Open the file only once.

srv_undo_tablespace_open(): Set the file handle for the opened
undo tablespace. This should ensure that ut_ad(file->is_open())
no longer fails in recv_add_trim().

xtrabackup_backup_func(): Remove some dead code.

xb_fil_cur_open(): Open files only if needed. Undo tablespaces
should already have been opened.
2018-11-27 14:49:39 +02:00
Marko Mäkelä
eb6364619f Remove the redundant variable fil_n_file_opened 2018-11-27 14:30:39 +02:00
Marko Mäkelä
074c684099 Merge 10.3 into 10.4 2018-11-06 16:24:16 +02:00
Marko Mäkelä
df563e0c03 Merge 10.2 into 10.3
main.derived_cond_pushdown: Move all 10.3 tests to the end,
trim trailing white space, and add an "End of 10.3 tests" marker.
Add --sorted_result to tests where the ordering is not deterministic.

main.win_percentile: Add --sorted_result to tests where the
ordering is no longer deterministic.
2018-11-06 09:40:39 +02:00
Vladislav Vaintroub
f8268f3cce MDEV-17195 Speedup lock-ddl-per-table, if there is a large number of tables
Query INFORMATION_SCHEMA.INNODB_SYS_TABLES only once, and cache results.

As a small cleanup, remove  mdl_lock_con_mutex, the MDL handling
connection is never accessed by multiple threads at the same time.
2018-10-31 17:19:25 +01:00
Sergei Golubchik
44f6f44593 Merge branch '10.0' into 10.1 2018-10-30 15:10:01 +01:00
Daniel Black
3859273d04 MDEV-14267: correct FSF address 2018-10-30 19:45:09 +08:00
Marko Mäkelä
444c380ceb Merge 10.3 into 10.4 2018-10-05 08:09:49 +03:00
Sergei Golubchik
57e0da50bb Merge branch '10.2' into 10.3 2018-09-28 16:37:06 +02:00
Marko Mäkelä
304857764b MDEV-13564 Mariabackup does not work with TRUNCATE
Implement undo tablespace truncation via normal redo logging.

Implement TRUNCATE TABLE as a combination of RENAME to #sql-ib name,
CREATE, and DROP.
2018-09-25 17:29:43 +03:00
Sergei Golubchik
5ae8fce50b Merge branch '10.1' into 10.2 2018-09-24 11:46:08 +02:00
Marko Mäkelä
8b6b2c3ea1 Fix mariabackup leaks (except my_load_defaults) 2018-09-21 11:18:59 +03:00
Vladislav Vaintroub
c139dc6d38 Windows : Fix application verifier errors.
Make different threads that are running data_copy_thread_func()
use the same pthread_mutex for locking (not the copies of the same
pthread_mutex_t).
2018-09-20 19:27:59 +01:00
Vladislav Vaintroub
0fa35ddf1f Amend fix for MDEV-17236
Simplify, and make it work with system tablespace outside of
innodb data home.

Also, do not reread TRX_SYS page  in endless loop,
if it appears to be corrupted.

Use finite number of attempts.
2018-09-20 14:08:57 +01:00
Thirunarayanan Balathandayuthapani
7e4bbd3aa6 MDEV-17236 mariabackup incorrectly tries to open ibdata1
if custom undo tablespace is defined

- In case of multiple undo tablespace, mariabackup have to open system
tablespace to find the list of undo tablespace present in TRX_SYS page.
For opening system tablespace, mariabackup should fetch the file name
from already initialized system tablespace object.
2018-09-20 17:13:12 +05:30
Marko Mäkelä
65474d92f5 Follow-up to "Fixed wrong printf in mariabackup"
This amends commit 4dc20ff687.

Starting with MariaDB 10.2, InnoDB defines
typedef size_t ulint;

The standard format for size_t uses the z modifier, for example,
"%zu" as in the macro ULINTPF.

"%lu" is wrong for size_t, because sizeof(unsigned long) can be
something else than sizeof(size_t). On Windows, the former would
always be 4 bytes, while size_t would be 4 or 8 bytes.
2018-09-18 06:23:25 +03:00
Michael Widenius
4dc20ff687 Fixed wrong printf in mariabackup 2018-09-16 11:23:27 +03:00
Vladislav Vaintroub
6b2da93359 MDEV-17192 Backup with -no-lock should fail, if DDL is detected at the end of backup 2018-09-14 09:36:02 +01:00
Oleksandr Byelkin
28f08d3753 Merge branch '10.1' into 10.2 2018-09-14 08:47:22 +02:00
Vladislav Vaintroub
e46b2a3e94 MDEV-12956 provide default datadir for mariabackup --copy-back
On Unix, it is compiled-in datadir value.
On Windows, the directory is ..\data, relative to directory
mariabackup.exe

server uses the same logic to determine datadir.
2018-09-11 20:59:35 +01:00
Marko Mäkelä
09af00cbde MDEV-13564: Remove old crash-upgrade logic in 10.4
Stop supporting the additional *trunc.log files that were
introduced via MySQL 5.7 to MariaDB Server 10.2 and 10.3.

DB_TABLESPACE_TRUNCATED: Remove.

purge_sys.truncate: A new structure to track undo tablespace
file truncation.

srv_start(): Remove the call to buf_pool_invalidate(). It is
no longer necessary, given that we no longer access things in
ways that violate the ARIES protocol. This call was originally
added for innodb_file_format, and it may later have been necessary
for the proper function of the MySQL 5.7 TRUNCATE recovery, which
we are now removing.

trx_purge_cleanse_purge_queue(): Take the undo tablespace as a
parameter.

trx_purge_truncate_history(): Rewrite everything mostly in a
single function, replacing references to undo::Truncate.

recv_apply_hashed_log_recs(): If any redo log is to be applied,
and if the log_sys.log.subformat indicates that separately
logged truncate may have been used, refuse to proceed except if
innodb_force_recovery is set. We will still refuse crash-upgrade
if TRUNCATE TABLE was logged. Undo tablespace truncation would
only be logged in undo*trunc.log files, which we are no longer
checking for.
2018-09-11 21:32:15 +03:00
Marko Mäkelä
67fa97dc2c Merge 10.3 into 10.4 2018-09-11 21:31:47 +03:00
Marko Mäkelä
1bf3e8ab43 Merge 10.3 into 10.4 2018-09-11 21:31:03 +03:00
Vladislav Vaintroub
c3124174c3 MDEV-17168 mariabackup reports "failed to open bitmap directory"
MariaDB does not support changed page tracking, since 10.2. Remove bitmap
initialization
2018-09-11 15:24:35 +01:00
Marko Mäkelä
5a1868b58d MDEV-13564 Mariabackup does not work with TRUNCATE
This is a merge from 10.2, but the 10.2 version of this will not
be pushed into 10.2 yet, because the 10.2 version would include
backports of MDEV-14717 and MDEV-14585, which would introduce
a crash recovery regression: Tables could be lost on
table-rebuilding DDL operations, such as ALTER TABLE,
OPTIMIZE TABLE or this new backup-friendly TRUNCATE TABLE.
The test innodb.truncate_crash occasionally loses the table due to
the following bug:

MDEV-17158 log_write_up_to() sometimes fails
2018-09-07 22:15:06 +03:00
Marko Mäkelä
980d1bf1a9 MDEV-14717: Prevent crash-downgrade to earlier MariaDB 10.2
A crash-downgrade of a RENAME (or TRUNCATE or table-rebuilding
ALTER TABLE or OPTIMIZE TABLE) operation to an earlier 10.2 version
would trigger a debug assertion failure during rollback,
in trx_roll_pop_top_rec_of_trx(). In a non-debug build, the
TRX_UNDO_RENAME_TABLE record would be misinterpreted as an
update_undo log record, and typically the file name would be
interpreted as DB_TRX_ID,DB_ROLL_PTR,PRIMARY KEY. If a matching
record would be found, row_undo_mod() would hit ut_error in
switch (node->rec_type). Typically, ut_a(table2 == NULL) would
fail when opening the table from SQL.

Because of this, we prevent a crash-downgrade to earlier MariaDB 10.2
versions by changing the InnoDB redo log format identifier to the
10.3 identifier, and by introducing a subformat identifier so that
10.2 can continue to refuse crash-downgrade from 10.3 or later.
After a clean shutdown, a downgrade to MariaDB 10.2.13 or later would
still be possible thanks to MDEV-14909. A downgrade to older 10.2
versions is only possible after removing the log files (not recommended).

LOG_HEADER_FORMAT_CURRENT: Change to 103 (originally the 10.3 format).

log_group_t: Add subformat. For 10.2, we will use subformat 1,
and will refuse crash recovery from any other subformat of the
10.3 format, that is, a genuine 10.3 redo log.

recv_find_max_checkpoint(): Allow startup after clean shutdown
from a future LOG_HEADER_FORMAT_10_4 (unencrypted only).
We cannot handle the encrypted 10.4 redo log block format,
which was introduced in MDEV-12041. Allow crash recovery from
the original 10.2 format as well as the new format.
In Mariabackup --backup, do not allow any startup from 10.3 or 10.4
redo logs.

recv_recovery_from_checkpoint_start(): Skip redo log apply for
clean 10.3 redo log, but not for the new 10.2 redo log
(10.3 format, subformat 1).

srv_prepare_to_delete_redo_log_files(): On format or subformat
mismatch, set srv_log_file_size = 0, so that we will display the
correct message.

innobase_start_or_create_for_mysql(): Check for format or subformat
mismatch.

xtrabackup_backup_func(): Remove debug assertions that were made
redundant by the code changes in recv_find_max_checkpoint().
2018-09-07 22:10:03 +03:00
Marko Mäkelä
055a3334ad MDEV-13564 Mariabackup does not work with TRUNCATE
Implement undo tablespace truncation via normal redo logging.

Implement TRUNCATE TABLE as a combination of RENAME to #sql-ib name,
CREATE, and DROP.

Note: Orphan #sql-ib*.ibd may be left behind if MariaDB Server 10.2
is killed before the DROP operation is committed. If MariaDB Server 10.2
is killed during TRUNCATE, it is also possible that the old table
was renamed to #sql-ib*.ibd but the data dictionary will refer to the
table using the original name.

In MariaDB Server 10.3, RENAME inside InnoDB is transactional,
and #sql-* tables will be dropped on startup. So, this new TRUNCATE
will be fully crash-safe in 10.3.

ha_mroonga::wrapper_truncate(): Pass table options to the underlying
storage engine, now that ha_innobase::truncate() will need them.

rpl_slave_state::truncate_state_table(): Before truncating
mysql.gtid_slave_pos, evict any cached table handles from
the table definition cache, so that there will be no stale
references to the old table after truncating.

== TRUNCATE TABLE ==

WL#6501 in MySQL 5.7 introduced separate log files for implementing
atomic and crash-safe TRUNCATE TABLE, instead of using the InnoDB
undo and redo log. Some convoluted logic was added to the InnoDB
crash recovery, and some extra synchronization (including a redo log
checkpoint) was introduced to make this work. This synchronization
has caused performance problems and race conditions, and the extra
log files cannot be copied or applied by external backup programs.

In order to support crash-upgrade from MariaDB 10.2, we will keep
the logic for parsing and applying the extra log files, but we will
no longer generate those files in TRUNCATE TABLE.

A prerequisite for crash-safe TRUNCATE is a crash-safe RENAME TABLE
(with full redo and undo logging and proper rollback). This will
be implemented in MDEV-14717.

ha_innobase::truncate(): Invoke RENAME, create(), delete_table().
Because RENAME cannot be fully rolled back before MariaDB 10.3
due to missing undo logging, add some explicit rename-back in
case the operation fails.

ha_innobase::delete(): Introduce a variant that takes sqlcom as
a parameter. In TRUNCATE TABLE, we do not want to touch any
FOREIGN KEY constraints.

ha_innobase::create(): Add the parameters file_per_table, trx.
In TRUNCATE, the new table must be created in the same transaction
that renames the old table.

create_table_info_t::create_table_info_t(): Add the parameters
file_per_table, trx.

row_drop_table_for_mysql(): Replace a bool parameter with sqlcom.

row_drop_table_after_create_fail(): New function, wrapping
row_drop_table_for_mysql().

dict_truncate_index_tree_in_mem(), fil_truncate_tablespace(),
fil_prepare_for_truncate(), fil_reinit_space_header_for_table(),
row_truncate_table_for_mysql(), TruncateLogger,
row_truncate_prepare(), row_truncate_rollback(),
row_truncate_complete(), row_truncate_fts(),
row_truncate_update_system_tables(),
row_truncate_foreign_key_checks(), row_truncate_sanity_checks():
Remove.

row_upd_check_references_constraints(): Remove a check for
TRUNCATE, now that the table is no longer truncated in place.

The new test innodb.truncate_foreign uses DEBUG_SYNC to cover some
race-condition like scenarios. The test innodb-innodb.truncate does
not use any synchronization.

We add a redo log subformat to indicate backup-friendly format.
MariaDB 10.4 will remove support for the old TRUNCATE logging,
so crash-upgrade from old 10.2 or 10.3 to 10.4 will involve
limitations.

== Undo tablespace truncation ==

MySQL 5.7 implements undo tablespace truncation. It is only
possible when innodb_undo_tablespaces is set to at least 2.
The logging is implemented similar to the WL#6501 TRUNCATE,
that is, using separate log files and a redo log checkpoint.

We can simply implement undo tablespace truncation within
a single mini-transaction that reinitializes the undo log
tablespace file. Unfortunately, due to the redo log format
of some operations, currently, the total redo log written by
undo tablespace truncation will be more than the combined size
of the truncated undo tablespace. It should be acceptable
to have a little more than 1 megabyte of log in a single
mini-transaction. This will be fixed in MDEV-17138 in
MariaDB Server 10.4.

recv_sys_t: Add truncated_undo_spaces[] to remember for which undo
tablespaces a MLOG_FILE_CREATE2 record was seen.

namespace undo: Remove some unnecessary declarations.

fil_space_t::is_being_truncated: Document that this flag now
only applies to undo tablespaces. Remove some references.

fil_space_t::is_stopping(): Do not refer to is_being_truncated.
This check is for tablespaces of tables. Potentially used
tablespaces are never truncated any more.

buf_dblwr_process(): Suppress the out-of-bounds warning
for undo tablespaces.

fil_truncate_log(): Write a MLOG_FILE_CREATE2 with a nonzero
page number (new size of the tablespace in pages) to inform
crash recovery that the undo tablespace size has been reduced.

fil_op_write_log(): Relax assertions, so that MLOG_FILE_CREATE2
can be written for undo tablespaces (without .ibd file suffix)
for a nonzero page number.

os_file_truncate(): Add the parameter allow_shrink=false
so that undo tablespaces can actually be shrunk using this function.

fil_name_parse(): For undo tablespace truncation,
buffer MLOG_FILE_CREATE2 in truncated_undo_spaces[].

recv_read_in_area(): Avoid reading pages for which no redo log
records remain buffered, after recv_addr_trim() removed them.

trx_rseg_header_create(): Add a FIXME comment that we could write
much less redo log.

trx_undo_truncate_tablespace(): Reinitialize the undo tablespace
in a single mini-transaction, which will be flushed to the redo log
before the file size is trimmed.

recv_addr_trim(): Discard any redo logs for pages that were
logged after the new end of a file, before the truncation LSN.
If the rec_list becomes empty, reduce n_addrs. After removing
any affected records, actually truncate the file.

recv_apply_hashed_log_recs(): Invoke recv_addr_trim() right before
applying any log records. The undo tablespace files must be open
at this point.

buf_flush_or_remove_pages(), buf_flush_dirty_pages(),
buf_LRU_flush_or_remove_pages(): Add a parameter for specifying
the number of the first page to flush or remove (default 0).

trx_purge_initiate_truncate(): Remove the log checkpoints, the
extra logging, and some unnecessary crash points. Merge the code
from trx_undo_truncate_tablespace(). First, flush all to-be-discarded
pages (beyond the new end of the file), then trim the space->size
to make the page allocation deterministic. At the only remaining
crash injection point, flush the redo log, so that the recovery
can be tested.
2018-09-07 22:10:02 +03:00
Marko Mäkelä
4901f31c13 Merge 10.2 into 10.3 2018-09-07 22:09:28 +03:00
Vladislav Vaintroub
58389c71c2 MDEV-16671 - crash in mariabackup with my.cnf with plugin-load=ha_rocksdb
Remove plugin-load option from mariabackup. It does not needed to be an
option (we only need to store the plugin-load value during backup phase,
and reuse the same value during --prepare).

Fix is to read plugin-load from backup-my.cnf during prepare.
2018-09-07 18:18:49 +01:00
Marko Mäkelä
2f4c391958 Merge 10.2 into 10.3 2018-09-06 22:35:45 +03:00
Vladislav Vaintroub
a0631e7221 MDEV-17149 mariabackup hangs if innodb is not started
Fix exit condition for the log copying thread.
2018-09-06 15:31:29 +01:00
Sergei Golubchik
8bee7c16c8 compiler warnings (clang 4.0.1 on i386)
extra/mariabackup/fil_cur.cc:361:42: warning: format specifies type 'unsigned long' but the argument has type 'ib_int64_t' (aka 'long long') [-Wformat]
extra/mariabackup/fil_cur.cc:376:9: warning: format specifies type 'unsigned long' but the argument has type 'ib_int64_t' (aka 'long long') [-Wformat]
sql/handler.cc:6196:45: warning: format specifies type 'unsigned long' but the argument has type 'wsrep_trx_id_t' (aka 'unsigned long long') [-Wformat]
sql/log.cc:1681:16: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
sql/log.cc:1687:16: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
sql/wsrep_sst.cc:1388:86: warning: format specifies type 'long' but the argument has type 'wsrep_seqno_t' (aka 'long long') [-Wformat]
sql/wsrep_sst.cc:232:86: warning: format specifies type 'long' but the argument has type 'wsrep_seqno_t' (aka 'long long') [-Wformat]
storage/connect/filamdbf.cpp:450:47: warning: format specifies type 'short' but the argument has type 'int' [-Wformat]
storage/connect/filamdbf.cpp:970:47: warning: format specifies type 'short' but the argument has type 'int' [-Wformat]
storage/connect/inihandl.cpp:197:16: warning: address of array 'key->name' will always evaluate to 'true' [-Wpointer-bool-conversion]
storage/innobase/btr/btr0scrub.cc:151:17: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/innobase/buf/buf0buf.cc:5085:8: warning: nonnull parameter 'bpage' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
storage/innobase/fil/fil0crypt.cc:2454:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/innobase/handler/ha_innodb.cc:18685:7: warning: format specifies type 'unsigned long' but the argument has type 'wsrep_trx_id_t' (aka 'unsigned long long') [-Wformat]
storage/innobase/row/row0mysql.cc:3319:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/innobase/row/row0mysql.cc:3327:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/maria/ma_norec.c:35:10: warning: implicit conversion from 'int' to 'my_bool' (aka 'char') changes value from 131 to -125 [-Wconstant-conversion]
storage/maria/ma_norec.c:42:10: warning: implicit conversion from 'int' to 'my_bool' (aka 'char') changes value from 131 to -125 [-Wconstant-conversion]
storage/maria/ma_test2.c:1009:12: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
storage/maria/ma_test2.c:1010:12: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat]
storage/mroonga/ha_mroonga.cpp:9189:44: warning: use of logical '&&' with constant operand [-Wconstant-logical-operand]
storage/mroonga/vendor/groonga/lib/expr.c:4987:22: warning: comparison of constant -1 with expression of type 'grn_operator' is always false [-Wtautological-constant-out-of-range-compare]
storage/xtradb/btr/btr0scrub.cc:151:17: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/xtradb/buf/buf0buf.cc:5047:8: warning: nonnull parameter 'bpage' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
storage/xtradb/fil/fil0crypt.cc:2454:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/xtradb/row/row0mysql.cc:3324:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
storage/xtradb/row/row0mysql.cc:3332:5: warning: format specifies type 'long' but the argument has type 'int' [-Wformat]
unittest/sql/mf_iocache-t.cc:120:35: warning: format specifies type 'unsigned long' but the argument has type 'int' [-Wformat]
unittest/sql/mf_iocache-t.cc:96:35: note: expanded from macro 'INFO_TAIL'
2018-09-04 09:19:48 +02:00
Marko Mäkelä
7830fb7f45 Merge 10.2 into 10.3 2018-08-28 12:22:56 +03:00
Marko Mäkelä
d6f7fd6016 MDEV-13564: Refuse MLOG_TRUNCATE in mariabackup
The MySQL 5.7 TRUNCATE TABLE is inherently incompatible
with hot backup, because it is creating and deleting a separate
log file, and it is not writing redo log for all changes of the
InnoDB data dictionary tables. Refuse to create a corrupted backup
if the unsafe form of TRUNCATE was executed.

Note: Undo log tablespace truncation cannot be detected easily.
Also it is incompatible with backup, for similar reasons.

xtrabackup_backup_func(): "Subscribe to" the log events before
the first invocation of xtrabackup_copy_logfile().

recv_parse_or_apply_log_rec_body(): If the function pointer
log_truncate is set, invoke it to report MLOG_TRUNCATE.
2018-08-16 16:10:18 +03:00
Vladislav Vaintroub
f926c5f4fa MDEV-16996 mariabackup --prepare does not use native AIO on Linux by default 2018-08-16 08:37:54 +01:00
Marko Mäkelä
734db318ac Merge 10.3 into 10.4 2018-08-16 10:08:30 +03:00
Marko Mäkelä
8716bb3b72 After-merge fix: Restore DECLARE_THREAD to fix Windows build 2018-08-16 09:41:20 +03:00
Marko Mäkelä
1eb2d8f6e8 Merge 10.2 into 10.3 2018-08-16 08:54:58 +03:00
Vladislav Vaintroub
922e7badfc MDEV-16791 mariabackup : Support DDL commands during backup 2018-08-14 15:11:13 +01:00
Vladislav Vaintroub
1faaaa9718 MDEV-15680 xb_aws_key_management fails in buildbot.
aws_key_management needs current directory to be datadir during
initalization, it scans current directory for encrypted keys.

Fix is to ensure, that plugin initialization in mariabackup happens
after the call to my_setwd(mysql_real_data_home).
2018-08-13 22:39:31 +01:00
Marko Mäkelä
f6d4f624eb MDEV-12041: innodb_encrypt_log key rotation
This will change the InnoDB encrypted redo log format only.
Unencrypted redo log will keep using the MariaDB 10.3 format.
In the new encrypted redo log format, 4 additional bytes will
be reserved in the redo log block trailer for storing the
encryption key version.

For performance reasons, the encryption key rotation
(checking if the latest encryption key version is being used)
is only done at log_checkpoint().

LOG_HEADER_FORMAT_CURRENT: Remove.

LOG_HEADER_FORMAT_ENC_10_4: The encrypted 10.4 format.

LOG_BLOCK_KEY: The encryption key version field.

LOG_BLOCK_TRL_SIZE: Remove.

log_t: Add accessors framing_size(), payload_size(), trailer_offset(),
to be used instead of referring to LOG_BLOCK_TRL_SIZE.

log_crypt_t: An operation passed to log_crypt().

log_crypt(): Perform decryption, encryption, or encryption with key
rotation. Return an error if key rotation at decryption fails.
On encryption, keep using the previous key if the rotation fails.
At startup, old-format encrypted redo log may be written before
the redo log is upgraded (rebuilt) to the latest format.

log_write_up_to(): Add the parameter rotate_key=false.

log_checkpoint(): Invoke log_write_up_to() with rotate_key=true.
2018-08-13 16:04:37 +03:00
Sergei Golubchik
0aa9b03393 Merge branch '10.2' into 10.3 2018-08-12 12:02:23 +02:00
Sergei Golubchik
ba1c05cc0d compiler warning
warning: suggest a space before ‘;’ or explicit braces around empty body in ‘for’ statement
2018-08-09 11:28:38 +02:00
Vladislav Vaintroub
3570ea9189 remove warning on Windows
Replace do-it-yourself version of strdup() with real strdup().
2018-08-05 00:36:59 +01:00
Marko Mäkelä
05459706f2 Merge 10.2 into 10.3 2018-08-03 15:57:23 +03:00
Marko Mäkelä
7b145fae13 mariabackup: Use snprintf() instead of sprintf() 2018-08-03 13:06:35 +03:00
Marko Mäkelä
814ae57daf Merge 10.1 into 10.2 2018-08-03 13:02:56 +03:00
Marko Mäkelä
769f6d2db7 Fix -Wclass-memaccess in WSREP,InnoDB,XtraDB 2018-08-03 12:21:13 +03:00
Marko Mäkelä
ef3070e997 Merge 10.1 into 10.2 2018-08-02 08:19:57 +03:00
Alexander Barkov
e61568ee93 Merge remote-tracking branch 'origin/10.3' into 10.4 2018-07-03 14:02:05 +04:00
Sergei Golubchik
36e59752e7 Merge branch '10.2' into 10.3 2018-06-30 16:39:20 +02:00
Sergei Petrunia
f46acd4a3a Adopt Debian's fix-FTBFS-on-GNU-Hurd.patch.
- Took the original patch by Ondrej Sury;
- Applied a fix for a known problem in the patch:
   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=882062
- Fixed a few other issues
2018-06-29 14:00:00 +03:00
Vladislav Vaintroub
95ef8de891 mariabackup - rename backup-rocksdb option to rocksdb-backup
to avoid "using unique option prefix 'backup' is error-prone", there is
already --backup option there.
2018-06-22 23:30:26 +01:00
Vladislav Vaintroub
ecc4682672 MDEV-16519 : mariabackup should fail if MDL could not be acquired with lock-ddl-per-table
There is a tiny chance for race condition during MDL acquisition.

If table is renamed just prior to
SELECT 1 FROM <table_name> LIMIT 0

then this query would  fail, yet mariabackup --backup does not handle
it as fatal error and continues, only to fail later during file copy.

The fix is to die on error, of MDL lock query fails.
2018-06-22 15:24:09 +01:00
Marko Mäkelä
0121d5a790 Merge 10.2 into 10.3 2018-06-18 15:43:59 +03:00
Marko Mäkelä
b8514c94f6 MDEV-16496 Mariabackup: Implement --verbose option to instrument InnoDB log apply
srv_print_verbose_log: Introduce the value 2 to refer to
mariabackup --verbose.

recv_recover_page(), recv_parse_log_recs(): Add output for
mariabackup --verbose.
2018-06-15 16:14:12 +03:00
Marko Mäkelä
ff317fe08e Follow-up to MDEV-16367 mariabackup: error: failed to copy enough redo log
Commit dc9c555415 moved the final phase of
the redo log copying to the background thread. This would sometimes cause
too little redo log to be copied at the end of the backup. We would only
guarantee copying up to the latest redo log checkpoint. This would produce
a consistent backup, but it could refer to a too old point of time.

xtrabackup_copy_log(), xtrabackup_copy_logfile(): Add the parameter 'last'.

xtrabackup_backup_low(): Copy any remaining part of the log after the
backup threads have terminated.
2018-06-15 13:31:43 +03:00
Marko Mäkelä
a79b033b35 MDEV-16457 mariabackup 10.2+ should default to innodb_checksum_algorithm=crc32
Since MariaDB Server 10.2.2 (and MySQL 5.7), the default value of
innodb_checksum_algorithm is crc32 (CRC-32C), not the inefficient "innodb"
checksum. Change Mariabackup to use the same default, so that checksum
validation (when using the default algorithm on the server) will take less
time during mariabackup --backup. Also, mariabackup --prepare should be
a little faster, and the server should read backups faster, because the
page checksums would only be validated against CRC-32C.
2018-06-14 14:23:20 +03:00
Vladislav Vaintroub
aba2d7301f MDEV-13122 Backup myrocksdb with mariabackup. 2018-06-07 15:13:54 +01:00
Marko Mäkelä
62d21ddac1 Merge 10.2 into 10.3 2018-06-07 15:07:00 +03:00
Marko Mäkelä
dc9c555415 MDEV-16367 mariabackup: error: failed to copy enough redo log
log_copying_thread(): Keep copying redo log until the end has been
reached. (Previously, we would stop copying as soon as
the first batch of xtrabackup_copy_logfile() returned.)

log_copying: Remove. Use log_copying_running instead.

copy_logfile: Remove. Log copying will now only be invoked from
2 places: from xtrabackup_backup_func() for the initial batch,
and from log_copying_thread() until all of the log has been read.
Use the global variable metadata_to_lsn for determining if the
final part of the log is being copied.

xtrabackup_copy_log(): Add diagnostic messages for terminating
the copying. These messages should be dead code, because
log_group_read_log_seg() should be checking for the same.

xtrabackup_copy_logfile(): Correct the retrying logic.
If anything was successfully read, process the portion that
was read. On failure, let the caller close dst_log_file.

io_watching_thread(): Stop throttling during the last phase
of copying the log (metadata_to_lsn!=0). The final copying
of the log will now be performed in log_copying_thread().

stop_backup_threads(): Clean up the message about stopping
the log copying thread.

xtrabackup_backup_low(): Read metadata_to_lsn from the latest
checkpoint header page, even if it is the first page.
Let the log_copying_thread take care of copying all of
the redo log.
2018-06-07 14:29:35 +03:00
Marko Mäkelä
619c277a6c Mariabackup: Make some globals static 2018-06-07 14:11:55 +03:00
Sergei Golubchik
37659ef43b mysys: remove dead ME_xxx flags 2018-06-04 12:32:22 +02:00
Sergei Golubchik
4ec8598c1d Merge branch 'github/10.2' into 10.3 2018-05-22 11:47:09 +02:00
Sergei Golubchik
3a7d7e180a cleanup: create_temp_file()
simplify. move common code inside, specify common flags inside,
rewrite dead code (`if (mode & O_TEMPORARY)` on Linux, where
`O_TEMPORARY` is always 0) to actually do something.
2018-05-21 16:34:10 +00:00
Vladislav Vaintroub
207e5ba316 mariabackup : Fix race condition when killing query waiting for MDL
Itcan happen that the connection is already gone during the window
between quering I_S.PROCESSLIST and KILL QUERY.

Fix is to tolerate ER_NO_SUCH_THREAD returned from KILL QUERY.

Add small improvement in message "Killing MDL query " to actually output
the query.
Also do not try to kill queries that are already in Killed state.
2018-05-19 17:04:47 +00:00
Sergei Golubchik
c9717dc019 Merge branch '10.2' into 10.3 2018-05-11 13:15:10 +02:00
Sergei Golubchik
9b1824dcd2 Merge branch '10.1' into 10.2 2018-05-10 13:01:42 +02:00
Vladislav Vaintroub
f98496da96 MDEV-16105: Mariabackup does not support SSL
The reason is the missing HAVE_OPENSSL define for mariabackup.
2018-05-08 19:52:08 +00:00
Marko Mäkelä
8bbcc0d505 MDEV-12218: Put back mariabackup --innodb-flush-method
Implement innodb_flush_method as an enum parameter in Mariabackup,
instead of ignoring the option and hard-wiring it to a default value.

xb0xb.h: Remove. Only xtrabackup.cc refers to the enum parameters.

innodb_flush_method_names[], innodb_flush_method_typelib[]:
Define as non-static, so that mariabackup can share the definitions.

srv_file_flush_method: Change the type to ulong, to match the
assignment in init_one_value() and handle_options() in mariabackup.
2018-04-30 18:22:52 +03:00
Marko Mäkelä
baa5a43d8c MDEV-16045: Replace log_group_t with log_t::files
There is only one log_sys and only one log_sys.log.

log_t::files::create(): Replaces log_init().

log_t::files::close(): Replaces log_group_close(), log_group_close_all().

fil_close_log_files(): if (free) log_sys.log_close();
The callers that passed free=true used to call log_group_close_all().

log_header_read(): Replaces log_group_header_read().

log_t::files::file_header_bufs_ptr: Use a single allocation.

log_t::files::file_header_bufs[]: Statically allocate the pointers.

log_t::files::set_fields(): Replaces log_group_set_fields().

log_t::files::calc_lsn_offset(): Replaces log_group_calc_lsn_offset().
Simplify the computation by using fewer variables.

log_t::files::read_log_seg(): Replaces log_group_read_log_seg().

log_sys_t::complete_checkpoint(): Replaces log_io_complete_checkpoint().

fil_aio_wait(): Move the logic from log_io_complete().
2018-04-29 09:46:24 +03:00
Marko Mäkelä
d73a898d64 MDEV-16045: Allocate log_sys statically
There is only one redo log subsystem in InnoDB. Allocate the object
statically, to avoid unnecessary dereferencing of the pointer.

log_t::create(): Renamed from log_sys_init().

log_t::close(): Renamed from log_shutdown().

log_t::checkpoint_buf_ptr: Remove. Allocate log_t::checkpoint_buf
statically.
2018-04-29 09:46:24 +03:00
Marko Mäkelä
715e4f4320 MDEV-12218 Clean up InnoDB parameter validation
Bind more InnoDB parameters directly to MYSQL_SYSVAR and
remove "shadow variables".

innodb_change_buffering: Declare as ENUM, not STRING.

innodb_flush_method: Declare as ENUM, not STRING.

innodb_log_buffer_size: Bind directly to srv_log_buffer_size,
without rounding it to a multiple of innodb_page_size.

LOG_BUFFER_SIZE: Remove.

SysTablespace::normalize_size(): Renamed from normalize().

innodb_init_params(): A new function to initialize and validate
InnoDB startup parameters.

innodb_init(): Renamed from innobase_init(). Invoke innodb_init_params()
before actually trying to start up InnoDB.

srv_start(bool): Renamed from innobase_start_or_create_for_mysql().
Added the input parameter create_new_db.

SRV_ALL_O_DIRECT_FSYNC: Define only for _WIN32.

xb_normalize_init_values(): Merge to innodb_init_param().
2018-04-29 09:41:42 +03:00
Marko Mäkelä
9ed2b2b2b8 Do not divide or multiply by srv_page_size
Instead, shift by srv_page_size_shift.
2018-04-28 20:52:22 +03:00
Marko Mäkelä
a90100d756 Replace univ_page_size and UNIV_PAGE_SIZE
Try to use one variable (srv_page_size) for innodb_page_size.

Also, replace UNIV_PAGE_SIZE_SHIFT with srv_page_size_shift.
2018-04-28 20:45:45 +03:00
Monty
2ccd6716fc Fix a lot of compiler warnings found by -Wunused 2018-04-26 17:35:12 +03:00
Marko Mäkelä
7396dfcca7 Merge 10.2 into 10.3 2018-04-24 20:59:57 +03:00
Marko Mäkelä
4cd7979c56 Merge 10.1 into 10.2 2018-04-24 09:39:45 +03:00
Marko Mäkelä
5b79303b40 MDEV-15988 Crash in ./mtr mariabackup.undo_space_id
xb_assign_undo_space_start(): Correctly pass the length of
the buffer, so that the file name will not be truncated.
2018-04-24 09:19:34 +03:00
Marko Mäkelä
6c64101bf0 MDEV-12266 follow-up fix to Mariabackup
xtrabackup_apply_delta(): Refer to fil_system.sys_space directly.
2018-04-23 13:14:28 +03:00
Marko Mäkelä
c6ba758d1d Merge 10.2 into 10.3 2018-04-23 09:49:58 +03:00
Marko Mäkelä
ea94717983 Merge 10.1 into 10.2 2018-04-21 11:58:32 +03:00
Michael Widenius
ddc5764303 Remove compiler warnings
- Remove unused variables
- Mark variables unused
- Fix wrong types
- Add no-strict-aliasing to BUILD scripts
2018-04-16 20:16:43 +03:00
Vladislav Vaintroub
c2dc72c0c3 MDEV-15779 - mariabackup incremental prepare fails on CIFS mount.
CIFS does not like O_DIRECT flag (it is set successfully, but pread would
fail).

The fix is not to use O_DIRECT, there is not need for it.
posix_fadvise() was used already that should prevent buffer cache
pollution on Linux.

As recommended by documentation of posix_fadvise(), we'll also fsync()
tablespaces after a batch of writes.
2018-04-12 12:09:32 +01:00
Vladislav Vaintroub
15071226a0 MDEV-15780 : Mariabackup with absolute paths in innodb_data_file_path
System tablespace can be specified with absolute path, if innodb_data_home_dir
is empty.

This was not handled well  by mariabackup

1. In backup phase, empty innodb_data_home_dir variable from server was
not recognized. full paths were stored in backup-my.ini, even if
it stored all files locally. thus prepare phase would not find the system
tablespace files.

2. In copy-back phase, copy would not be done to the absolute destination
path, as path would be stripped with trim_dotslash

This patch fixes the above defects.
2018-04-12 11:15:27 +01:00
Vicențiu Ciorbaru
65eefcdc60 Merge remote-tracking branch '10.2' into 10.3 2018-04-12 12:41:19 +03:00
Vladislav Vaintroub
4c7a1a1b9e MDEV-15780 : mariabackup does not handle absolute names in for system tablespaces
Fix 10.2-specific bug - copy-back is not prepared to handle system
tablespaces with absolute path.
2018-04-11 23:25:45 +01:00
Vicențiu Ciorbaru
45e6d0aebf Merge branch '10.1' into 10.2 2018-04-10 17:43:18 +03:00
Vladislav Vaintroub
ecf6675cfc MDEV-15713 mariabackup: throw warning, if --stream is used without --backup 2018-04-09 19:16:50 +01:00
Vladislav Vaintroub
37f24806fc MDEV-15825 Mariabackup help mentions Percona and PXC but not MariaDB 2018-04-09 16:22:15 +01:00
Marko Mäkelä
a5da1c64f8 Merge 10.2 into 10.3 2018-04-04 08:24:57 +03:00
Marko Mäkelä
bc2501453c Remove an unused variable 2018-04-03 16:58:35 +03:00
Thirunarayanan Balathandayuthapani
d9c5a46678 MDEV-15737 assertion in mariabackup.exe!recv_calc_lsn_on_data_add()
- recovered_lsn shouldn't be initialized during xtrabackup_copy_logfile().
If partial redo log read during the end of xtrabackup_copy_logfile() then
recovered_lsn will be different from scanned_lsn. Re-initialization of
recovered_lsn could lead to partial read again. It is a regression of
MDEV-14545
2018-04-03 16:43:36 +05:30
Vladislav Vaintroub
27c24808f7 MDEV-15636 mariabackup --lock-ddl-per-table hangs if ALTER table is running
concurrently.

There is a deadlock between

C1 mariabackup's connection that holds MDL locks
C2 Online ALTER TABLE that wants to have MDL exclusively
   and tries to upgrade its mdl lock.
C3 another mariabackup's connection that does FLUSH TABLES (or FTWRL)

C3 waits waits for C2,  which waits for C1, which waits for C3,
thus the deadlock.


MDL locks cannot be released until FLUSH  succeeds, because
otherwise it would allow ALTER to sneak in, causing backup to abort and
breaking lock-ddl-per-table's promise.

The fix here workarounds the deadlock, by killing connections in
"Waiting for metadata lock" status (i.e ALTER). This killing continues
until FTWRL succeeds.

Killing connections is skipped in case --no-locks parameter
was  passed to backup, because there won't be a FLUSH.

For the reference,in Percona's xtrabackup --lock-ddl-per-connection
silently implies --no-lock ie FLUSH is always skipped there.

A rather large part of fix is introducing DBUG capability to start
a query  the new connection at the right moment of backup
compensating somewhat for mariabackup' lack of send_query or DBUG_SYNC.
2018-04-01 14:26:06 +00:00
Marko Mäkelä
4cad42392a MDEV-12266: Change dict_table_t::space to fil_space_t*
InnoDB always keeps all tablespaces in the fil_system cache.
The fil_system.LRU is only for closing file handles; the
fil_space_t and fil_node_t for all data files will remain
in main memory. Between startup to shutdown, they can only be
created and removed by DDL statements. Therefore, we can
let dict_table_t::space point directly to the fil_space_t.

dict_table_t::space_id: A numeric tablespace ID for the corner cases
where we do not have a tablespace. The most prominent examples are
ALTER TABLE...DISCARD TABLESPACE or a missing or corrupted file.

There are a few functional differences; most notably:
(1) DROP TABLE will delete matching .ibd and .cfg files,
even if they were not attached to the data dictionary.
(2) Some error messages will report file names instead of numeric IDs.

There still are many functions that use numeric tablespace IDs instead
of fil_space_t*, and many functions could be converted to fil_space_t
member functions. Also, Tablespace and Datafile should be merged with
fil_space_t and fil_node_t. page_id_t and buf_page_get_gen() could use
fil_space_t& instead of a numeric ID, and after moving to a single
buffer pool (MDEV-15058), buf_pool_t::page_hash could be moved to
fil_space_t::page_hash.

FilSpace: Remove. Only few calls to fil_space_acquire() will remain,
and gradually they should be removed.

mtr_t::set_named_space_id(ulint): Renamed from set_named_space(),
to prevent accidental calls to this slower function. Very few
callers remain.

fseg_create(), fsp_reserve_free_extents(): Take fil_space_t*
as a parameter instead of a space_id.

fil_space_t::rename(): Wrapper for fil_rename_tablespace_check(),
fil_name_write_rename(), fil_rename_tablespace(). Mariabackup
passes the parameter log=false; InnoDB passes log=true.

dict_mem_table_create(): Take fil_space_t* instead of space_id
as parameter.

dict_process_sys_tables_rec_and_mtr_commit(): Replace the parameter
'status' with 'bool cached'.

dict_get_and_save_data_dir_path(): Avoid copying the fil_node_t::name.

fil_ibd_open(): Return the tablespace.

fil_space_t::set_imported(): Replaces fil_space_set_imported().

truncate_t: Change many member function parameters to fil_space_t*,
and remove page_size parameters.

row_truncate_prepare(): Merge to its only caller.

row_drop_table_from_cache(): Assert that the table is persistent.

dict_create_sys_indexes_tuple(): Write SYS_INDEXES.SPACE=FIL_NULL
if the tablespace has been discarded.

row_import_update_discarded_flag(): Remove a constant parameter.
2018-03-29 22:02:05 +03:00
Marko Mäkelä
05863142ad MDEV-12266: Remove fil_system_t::named_spaces
fil_space_get_by_name(): Remove.
(Implement differently in mariabackup.)

fil_ibd_open(): Check if the tablespace by the same ID already
exists. If it is the same name, return success, else failure.
2018-03-29 20:47:42 +03:00
Marko Mäkelä
2ac8b1a907 MDEV-12266: Add fil_system.sys_space, temp_space
Add fil_system_t::sys_space, fil_system_t::temp_space.
These will replace lookups for TRX_SYS_SPACE or SRV_TMP_SPACE_ID.

mtr_t::m_undo_space, mtr_t::m_sys_space: Remove.

mtr_t::set_sys_modified(): Remove.

fil_space_get_type(), fil_space_get_n_reserved_extents(): Remove.

fsp_header_get_tablespace_size(), fsp_header_inc_size():
Merge to the only caller, innobase_start_or_create_for_mysql().
2018-03-29 19:18:11 +03:00
Marko Mäkelä
600c85e85a Allocate the singleton fil_system statically
fil_system_t::create(): Replaces fil_init(), fsp_init().

fil_system_t::close(): Replaces fil_close().

fil_system_t::max_n_open: Remove. Use srv_max_n_open_files directly.
2018-03-29 19:18:10 +03:00
Sergei Golubchik
b1818dccf7 Merge branch '10.2' into 10.3 2018-03-28 17:31:57 +02:00
Sergei Golubchik
c764bc0a78 Merge branch '10.1' into 10.2 2018-03-25 13:02:52 +02:00
Sergei Golubchik
5fdbc3f66b compiler warning
extra/mariabackup/ds_buffer.c:145:9: warning: pointer targets in assignment differ in signedness [-Wpointer-sign]
2018-03-24 14:17:31 +01:00
Vladislav Vaintroub
af86422f08 MDEV-13023 mariabackup does not preserve holes for page compressed tables.
Changed "local" datasink logic to detect page compressed Innodb tables.

Whenever such table is detected, holes in the copied files are created by
skipping over binary zeros at the end of each compressed page.
2018-03-23 15:30:01 +00:00
Marko Mäkelä
e80a842000 Merge 10.1 into 10.2 2018-03-22 18:02:40 +02:00
Thirunarayanan Balathandayuthapani
b6d68c6aa3 MDEV-13561 Mariabackup is incompatible with retroactively created innodb_undo_tablespaces
- Mariabackup supports starting undo tablespace id which is greater
than 1.
2018-03-22 14:19:16 +05:30
Marko Mäkelä
865cec928a MDEV-15505 Fix wsrep XID storage byte order 2018-03-21 16:29:30 +02:00
Teemu Ollakka
33aad1d273 MDEV-15505 Fixes to compilation without -DWITH_WSREP:BOOL=ON
Removed including wsrep_api.h from service_wsrep.h. This caused
various kinds of collisions with definitions when wsrep is
not supposed to be built in. Defined functions wsrep_xid_seqno()
and wsrep_xid_uuid() in wsrep_dummy.cc. Replaced wsrep_seqno_t
with long long where wsrep_api.h is not included.

Removed wsrep_xid_seqno() macro from wsrep_mysqld.h and made
wsrep code using wsrep_xid_seqno() in handler.cc to be compiled
in only if WITH_WSREP is ON.

Included wsrep_api.h for mariabackup if WITH_WSREP is ON.
2018-03-21 12:02:09 +02:00
Thirunarayanan Balathandayuthapani
6d1d5c3aeb MDEV-14545 Backup fails due to MLOG_INDEX_LOAD record
- Fixed the asan failure of the unsupported_redo test case
2018-03-16 20:55:55 +05:30
Marko Mäkelä
bd7ed1b923 MDEV-13935 INSERT stuck at state Unlocking tables
Revert the dead code for MySQL 5.7 multi-master replication (GCS),
also known as
WL#6835: InnoDB: GCS Replication: Deterministic Deadlock Handling
(High Prio Transactions in InnoDB).

Also, make innodb_lock_schedule_algorithm=vats skip SPATIAL INDEX,
because the code does not seem to be compatible with them.

Add FIXME comments to some SPATIAL INDEX locking code. It looks
like Galera write-set replication might not work with SPATIAL INDEX.
2018-03-16 15:50:04 +02:00
Marko Mäkelä
84129fb1b5 After-merge fix for commit 98eb9518db
The merge only covered 10.1 up to
commit 4d248974e0.

Actually merge the changes up to
commit 0a534348c7.

Also, remove the unused InnoDB field trx_t::abort_type.
2018-03-16 15:49:53 +02:00
Sergey Vojtovich
0a534348c7 MDEV-14265 - RPMLint warning: shared-lib-calls-exit
Eliminated last exit() call from libmysqld.
2018-03-16 13:26:52 +04:00
Thirunarayanan Balathandayuthapani
ff909acfa4 MDEV-14545 Backup fails due to MLOG_INDEX_LOAD record
Problem:
=======
  Mariabackup exits during prepare phase if it encounters
MLOG_INDEX_LOAD redo log record. MLOG_INDEX_LOAD record
informs Mariabackup that the backup cannot be completed based
on the redo log scan, because some information is purposely
omitted due to bulk index creation in ALTER TABLE.

Solution:
========
Detect the MLOG_INDEX_LOAD redo record during backup phase and
exit the mariabackup with the proper error message.
2018-03-13 15:25:38 +05:30
Marko Mäkelä
7fb03d7abf Merge bb-10.2-ext into 10.3 2018-03-13 08:15:06 +02:00
Teemu Ollakka
b125ae0a84 MDEV-15505 New wsrep XID format for backwards compatibility
A new wsrep XID format was added to keep the XID implementation
backwards compatible. Original version always reads XID seqno
part in host byte order, the new version in little endian byte
order. Wsrep XID will always be written in the new format.

Included wsrep_api.h from service_wsrep.h for wsrep type definitions.
Removed redundant wsrep XID code from mariabackup and included
service_wsrep.h in order to use
2018-03-12 14:51:49 +02:00