Commit graph

18765 commits

Author SHA1 Message Date
Marko Mäkelä
d902d43ce7 Merge 10.1 into 10.2 2017-07-06 20:28:08 +03:00
Vladislav Vaintroub
6f1f911497 MDEV-12445 : Rocksdb does not shutdown worker threads and aborts in memleak check on server shutdown
Disable memory leak check in debug server, if rocksdb is loaded.
There is some subtle bug somewhere in 3rd party code we cannot
do much about.

The bug is manifested as follows

Rocksdb does not shutdown worker threads, when plugin is shut down. Thus
OS does not unload the library since there are some active threads using
this library's code. Thus global destructors in the library do not run,
and there is still some memory allocated when server exits.

The workaround disables server's memory leak check, if rocksdb engine was
loaded.
2017-07-06 11:36:17 +00:00
Vladislav Vaintroub
53d6325db0 Fix assertion in rocksb.
Use thd_query_safe() to read query from a different connection.
2017-07-06 11:35:54 +00:00
Marko Mäkelä
2b5c9bc2c8 MDEV-13247 innodb_log_compressed_pages=OFF breaks crash recovery of ROW_FORMAT=COMPRESSED tables
The option innodb_log_compressed_pages was contributed by
Facebook to MySQL 5.6. It was disabled in the 5.6.10 GA release
due to problems that were fixed in 5.6.11, which is when the
option was enabled.

The option was set to innodb_log_compressed_pages=ON by default
(disabling the feature), because safety was considered more
important than speed. The option innodb_log_compressed_pages=OFF
can *CORRUPT* ROW_FORMAT=COMPRESSED tables on crash recovery
if the zlib deflate function is behaving differently (producing
a different amount of compressed data) from how it behaved
when the redo log records were written (prior to the crash recovery).

In MDEV-6935, the default value was changed to
innodb_log_compressed_pages=OFF. This is inherently unsafe, because
there are very many different environments where MariaDB can be
running, using different zlib versions. While zlib can decompress
data just fine, there are no guarantees that different versions will
always compress the same data to the exactly same size. To avoid
problems related to zlib upgrades or version mismatch, we must
use a safe default setting.

This will reduce the write performance for users of
ROW_FORMAT=COMPRESSED tables. If you configure
innodb_log_compressed_pages=ON, please make sure that you will
always cleanly shut down InnoDB before upgrading the server
or zlib.
2017-07-06 14:18:53 +03:00
Marko Mäkelä
0ff62ad808 InnoDB: Remove a redundant condition and an outdated comment
Since MariaDB 10.2.2, temporary table metadata is not written
to the InnoDB data dictionary tables. Therefore,
the DICT_TF2_TEMPORARY flag cannot be set in SYS_TABLES,
except if there exist orphan temporary tables that were created
before MariaDB 10.2.2.

trx_resurrect_table_locks(): Do not skip temporary tables.
If a resurrect transaction modified a temporary table that was
created before MariaDB 10.2.2, that table would be treated
internally as a persistent table. It is safer to resurrect
locks than to skip the table, because the table would be modified
on transaction rollback.
2017-07-06 02:14:33 +03:00
Marko Mäkelä
72a2de92a1 Avoid a hang when InnoDB startup is aborted during redo log apply
buf_flush_page_cleaner_coordinator: In the first loop, use an
appropriate termination condition, waiting for !recv_writer_thread_active.

logs_empty_and_mark_files_at_shutdown(): Signal recv_sys->flush_start
in case the recv_writer_thread was never started, or
buf_flush_page_cleaner_coordinator failed to notice its termination.

innobase_start_or_create_for_mysql(): Remove a redundant, unreachable
condition, and properly release resources when aborting startup due to
recv_sys->found_corrupt_log.
2017-07-05 22:09:28 +03:00
Sergei Golubchik
f6633bf058 Merge branch '10.1' into 10.2 2017-07-05 19:08:55 +02:00
Sergei Golubchik
51d457f371 compilation failures
with -DPLUGIN_PARTITION=NO and -DPLUGIN_PERFSCHEMA=NO
2017-07-05 17:15:59 +02:00
Sergei Golubchik
785e2248bd MDEV-13089 identifier quoting in partitioning
don't print partitioning expression as it was entered by the user,
use Item::print() according to the sql_mode and sql_quote_show_create
2017-07-05 17:15:59 +02:00
Sergei Golubchik
504eff0ca1 cleanup: generate_partition_syntax()
Don't write to a temporary file, use String.
Remove strange one-liner "helpers", use String methods.
Don't use current_thd, don't allocate memory for 1-byte strings, etc.
2017-07-05 17:15:58 +02:00
Sergei Golubchik
47687eef41 MDEV-12936 upgrade to 10.2.6 failed upon tables with virtual columns
when opening 10.1- table that has virtual columns:

1. don't error out if it has vcols over autoinc columns.
   just issue a warning.
2. set vcol type properly
3. in innodb: use table->s->stored_fields instead of table->s->fields,
   because that's what was stored in innodb data dictionary
2017-07-05 17:15:58 +02:00
Sergei Golubchik
7bea860709 MDEV-12923 MyISAM allows CHECK constraint violation in ALTER TABLE
use correct type for Alter_inplace_info flags.
2017-07-05 17:15:57 +02:00
Sergei Golubchik
32e3b02234 MDEV-11639 Server crashes in update_virtual_field, gcol.innodb_virtual_basic fails in buildbot
don't use thd->query_id check in background purge threads
(it doesn't work, because thd->query_id is never incremented there)
instead use thd->open_tables directly, there can be only one table
there anyway, and this is the table opened by this purge thread.
2017-07-05 17:15:57 +02:00
Marko Mäkelä
e555540ab6 MDEV-13105 InnoDB fails to load a table with PAGE_COMPRESSION_LEVEL after upgrade from 10.1.20
When using innodb_page_size=16k, InnoDB tables
that were created in MariaDB 10.1.0 to 10.1.20 with
PAGE_COMPRESSED=1 and
PAGE_COMPRESSION_LEVEL=2 or PAGE_COMPRESSION_LEVEL=3
would fail to load.

fsp_flags_is_valid(): When using innodb_page_size=16k, use a
more strict check for .ibd files, with the assumption that
nobody would try to use different-page-size files.
2017-07-05 14:55:56 +03:00
Marko Mäkelä
e3d3147792 MDEV-13105 InnoDB fails to load a table with PAGE_COMPRESSION_LEVEL after upgrade from 10.1.20
When using innodb_page_size=16k, InnoDB tables
that were created in MariaDB 10.1.0 to 10.1.20 with
PAGE_COMPRESSED=1 and
PAGE_COMPRESSION_LEVEL=2 or PAGE_COMPRESSION_LEVEL=3
would fail to load.

fsp_flags_is_valid(): When using innodb_page_size=16k, use a
more strict check for .ibd files, with the assumption that
nobody would try to use different-page-size files.
2017-07-05 14:35:55 +03:00
Marko Mäkelä
e417af2cb8 MDEV-13143 Server crashes in srv_init_abort_low() when started with inaccessible tmpdir
This is a regression caused by
commit bb60a832ed

srv_shutdown_all_bg_threads(): If os_thread_count indicates that
no threads are running, do not bother checking thread status.
This avoids a crash when InnoDB startup is aborted before
os_aio_init() has been invoked. (os_aio_all_slots_free() would
dereference AIO::s_reads even though it is NULL.)
2017-07-05 12:45:15 +03:00
Marko Mäkelä
8c71c6aa8b MDEV-12548 Initial implementation of Mariabackup for MariaDB 10.2
InnoDB I/O and buffer pool interfaces and the redo log format
have been changed between MariaDB 10.1 and 10.2, and the backup
code has to be adjusted accordingly.

The code has been simplified, and many memory leaks have been fixed.
Instead of the file name xtrabackup_logfile, the file name ib_logfile0
is being used for the copy of the redo log. Unnecessary InnoDB startup and
shutdown and some unnecessary threads have been removed.

Some help was provided by Vladislav Vaintroub.

Parameters have been cleaned up and aligned with those of MariaDB 10.2.

The --dbug option has been added, so that in debug builds,
--dbug=d,ib_log can be specified to enable diagnostic messages
for processing redo log entries.

By default, innodb_doublewrite=OFF, so that --prepare works faster.
If more crash-safety for --prepare is needed, double buffering
can be enabled.

The parameter innodb_log_checksums=OFF can be used to ignore redo log
checksums in --backup.

Some messages have been cleaned up.
Unless --export is specified, Mariabackup will not deal with undo log.
The InnoDB mini-transaction redo log is not only about user-level
transactions; it is actually about mini-transactions. To avoid confusion,
call it the redo log, not transaction log.

We disable any undo log processing in --prepare.

Because MariaDB 10.2 supports indexed virtual columns, the
undo log processing would need to be able to evaluate virtual column
expressions. To reduce the amount of code dependencies, we will not
process any undo log in prepare.

This means that the --export option must be disabled for now.

This also means that the following options are redundant
and have been removed:
	xtrabackup --apply-log-only
	innobackupex --redo-only

In addition to disabling any undo log processing, we will disable any
further changes to data pages during --prepare, including the change
buffer merge. This means that restoring incremental backups should
reliably work even when change buffering is being used on the server.
Because of this, preparing a backup will not generate any further
redo log, and the redo log file can be safely deleted. (If the
--export option is enabled in the future, it must generate redo log
when processing undo logs and buffered changes.)

In --prepare, we cannot easily know if a partial backup was used,
especially when restoring a series of incremental backups. So, we
simply warn about any missing files, and ignore the redo log for them.

FIXME: Enable the --export option.

FIXME: Improve the handling of the MLOG_INDEX_LOAD record, and write
a test that initiates a backup while an ALGORITHM=INPLACE operation
is creating indexes or rebuilding a table. An error should be detected
when preparing the backup.

FIXME: In --incremental --prepare, xtrabackup_apply_delta() should
ensure that if FSP_SIZE is modified, the file size will be adjusted
accordingly.
2017-07-05 11:43:28 +03:00
Marko Mäkelä
dc722559cc Correct a message 2017-07-05 10:16:36 +03:00
Marko Mäkelä
41a6475b49 InnoDB: Use access() instead of open() 2017-07-05 08:02:55 +03:00
Marko Mäkelä
ad2d722acd MDEV-13228: Add a comment that clarifies the constants 2017-07-03 13:13:54 +03:00
Marko Mäkelä
d438a448e6 MDEV-13228 Assertion `n < rec_offs_n_fields(offsets)' failed in rec_get_nth_field_offs upon crash recovery with compressed table
In my preparatory patch for MDEV-12288, there was an off-by-one
array initialization error that affected debug builds.
2017-07-03 12:17:10 +03:00
Marko Mäkelä
56ff6f1b0b InnoDB: Remove dead code for DATA_POINT and DATA_VAR_POINT
The POINT data type is being treated just like any other
geometry data type in InnoDB. The fixed-length data type
DATA_POINT had been introduced in WL#6942 based on a
misunderstanding and without appropriate review.
Because of fundamental design problems (such as a
DEFAULT POINT(0 0) value secretly introduced by InnoDB),
the code was disabled in Oracle Bug#20415831 fix.

This patch removes the dead code and definitions that were
left behind by the Oracle Bug#20415831 patch.
2017-07-03 12:17:10 +03:00
Marko Mäkelä
cf2789bf0b ha_innobase::write_row(): Test the cheaper condition first 2017-07-03 12:17:00 +03:00
Monty
92f1837a27 Fixed compilation warnings (while testing 32 bit builds) 2017-07-01 14:26:42 +03:00
Marko Mäkelä
c436338d9d Assert that DB_TRX_ID must be set on delete-marked records
This is preparation for MDEV-12288, which would set DB_TRX_ID=0
when purging history. Also with that change in place, delete-marked
records must always refer to an undo log record via a nonzero
DB_TRX_ID column. (The DB_TRX_ID is only present in clustered index
leaf page records.)

btr_cur_parse_del_mark_set_clust_rec(), rec_get_trx_id():
Statically allocate the offsets
(should never use the heap). Add some debug assertions.

Replace some use of rec_get_trx_id() with row_get_rec_trx_id().

trx_undo_report_row_operation(): Add some sanity checks that are
common for all operations that produce undo log.
2017-07-01 11:02:58 +03:00
Jacob Mathew
806d4e3127 Run spider mtr suites in 10.1 only on demand. 2017-06-30 16:17:29 -07:00
Jacob Mathew
53235cbb1f Updated spider/bg and spider/handler mtr suites. 2017-06-30 14:58:05 -07:00
Monty
dd8474b1dc Added tmp_disk_table_size to limit size of Aria temp tables in tmpdir
- Added variable tmp_disk_table_size
- Added variable tmp_memory_table_size as an alias for tmp_table_size
- Changed internal variable tmp_table_size to tmp_memory_table_size
- create_info.data_file_length is now set with tmp_disk_table_size
- Fixed that Aria doesn't reset max_data_file_length for internal tables
- Added status flag if table is full so that we can detect this on next insert.
  This ensures that the table is always 'correct', but we get the error one
  row after the row that grow the table too big.
- Removed some mutex lock for internal temporary tables
2017-06-30 22:31:37 +03:00
Marko Mäkelä
e16425ba97 Fix a compiler warning 2017-06-30 18:50:30 +03:00
Vladislav Vaintroub
4877659006 MDEV-12097 : avoid too large memory allocation in innodb buffer pool
on Windows

Align innodb_pool_size up to the innodb_buf_pool_chunk_unit.
2017-06-30 15:35:43 +00:00
Sergei Golubchik
d38b15de65 Merge branch '10.0-galera' into 10.1 2017-06-30 16:33:13 +02:00
Marko Mäkelä
bf262bd957 MDEV-11649 Uninitialized field fts_token->position in innodb_fts.innodb_fts_plugin
The field fts_token->position is not initialized in
row_merge_fts_doc_tokenize(). We cannot have that field
without changing the fulltext parser plugin ABI
(adding st_mysql_ftparser_boolean_info::position,
as it was done in MySQL 5.7 in WL#6943).

The InnoDB fulltext parser plugins "ngram" and "Mecab" that were
introduced in MySQL 5.7 do depend on that field. But the simple_parser
does not. Apparently, simple_parser is leaving the field as 0.

So, in our fix we will assume that the missing position field is 0.
2017-06-30 15:03:53 +03:00
Marko Mäkelä
dd75087993 Fix a compilation warning 2017-06-30 15:03:01 +03:00
Marko Mäkelä
61847b9d80 Tablespace: Add iterator, const_iterator, begin(), end() 2017-06-30 09:31:19 +03:00
Marko Mäkelä
84e4e4506f Reduce the granularity of innodb_log_file_size
In Mariabackup, we would want the backed-up redo log file size to be
a multiple of 512 bytes, or OS_FILE_LOG_BLOCK_SIZE. However, at startup,
InnoDB would be picky, requiring the file size to be a multiple of
innodb_page_size.

Furthermore, InnoDB would require the parameter to be a multiple of
one megabyte, while the minimum granularity is 512 bytes. Because
the data-file-oriented fil_io() API is being used for writing the
InnoDB redo log, writes will for now require innodb_log_file_size to
be a multiple of the maximum innodb_page_size (65536 bytes).

To complicate matters, InnoDB startup divided srv_log_file_size by
UNIV_PAGE_SIZE, so that initially, the unit was bytes, and later it
was innodb_page_size. We will simplify this and keep srv_log_file_size
in bytes at all times.

innobase_log_file_size: Remove. Remove some obsolete checks against
overflow on 32-bit systems. srv_log_file_size is always 64 bits, and
the maximum size 512GiB in multiples of innodb_page_size always fits
in ulint (which is 32 or 64 bits). 512GiB would be 8,388,608*64KiB or
134,217,728*4KiB.

log_init(): Remove the parameter file_size that was always passed as
srv_log_file_size.

log_set_capacity(): Add a parameter for passing the requested file size.

srv_log_file_size_requested: Declare static in srv0start.cc.

create_log_file(), create_log_files(),
innobase_start_or_create_for_mysql(): Invoke fil_node_create()
with srv_log_file_size expressed in multiples of innodb_page_size.

innobase_start_or_create_for_mysql(): Require the redo log file sizes
to be multiples of 512 bytes.
2017-06-29 23:15:05 +03:00
Marko Mäkelä
e903d458bb Clean up InnoDB shutdown
Tablespace::shutdown(): Clear m_path. This was moved from
Tablespace::~Tablespace().

LatchDebug::shutdown(): Remove a redundant condition.
2017-06-29 23:10:46 +03:00
Marko Mäkelä
591edccc93 Simplify access to the binlog offset in InnoDB
trx_sys_print_mysql_binlog_offset(): Use 64-bit arithmetics and ib::info().

TRX_SYS_MYSQL_LOG_OFFSET: Replaces TRX_SYS_MYSQL_LOG_OFFSET_HIGH,
TRX_SYS_MYSQL_LOG_OFFSET_LOW.

trx_sys_update_mysql_binlog_offset(): Remove the constant parameter
field=TRX_SYS_MYSQL_LOG_INFO. Use 64-bit arithmetics.
2017-06-29 23:03:39 +03:00
Marko Mäkelä
aea0e125d2 trx_sys_read_wsrep_checkpoint(): Return whether a checkpoint is present 2017-06-29 23:01:47 +03:00
Marko Mäkelä
cd623508df buf_read_ibuf_merge_pages(): Discard all entries for a missing tablespace
A similar change was contributed to Percona XtraBackup, but for some
reason, it is not present in Percona XtraDB. Since MDEV-9566
(MariaDB 10.1.23), that change is present in the MariaDB XtraDB.
2017-06-29 22:39:06 +03:00
Marko Mäkelä
68b5aeae4e Minor cleanup of InnoDB I/O routines
Change many function parameters from IORequest& to const IORequest&.

Remove an unused definition of ECANCELED.
2017-06-29 22:30:47 +03:00
Marko Mäkelä
859714e73d Simplify up InnoDB redo log system startup and shutdown
recv_sys_init(): Remove the parameter.

recv_sys_create(): Merge to recv_sys_init().

recv_sys_mem_free(): Merge to recv_sys_close().

log_mem_free(): Merge to log_shutdown().
2017-06-29 22:24:48 +03:00
Marko Mäkelä
8143ef1b7c trx_validate_state_before_free(): Add debug assertions 2017-06-29 22:20:34 +03:00
Marko Mäkelä
bb60a832ed Minor cleanup of InnoDB shutdown
os_thread_active(): Remove.

srv_shutdown_all_bg_threads(): Assert that high-level threads
have already exited. Do not sleep if os_thread_count=0.
2017-06-29 22:20:34 +03:00
Vicențiu Ciorbaru
9003869390 Simplify IO_CACHE by removing current_pos and end_pos as self-references
These self references were previously used to avoid having to check the
IO_CACHE's type. However, a benchmark shows that on x86 5930k stock,
the type comparison is marginally faster than the double pointer dereference.
For 40 billion my_b_tell calls, the difference is .1 seconds in favor of performing the
type check. (Basically there is no measurable difference)

To prevent bugs from copying the structure using the equals(=) operator,
and having to do the bookkeeping manually, remove these "convenience"
variables.
2017-06-28 15:23:36 +03:00
Marko Mäkelä
b3171607e3 Avoid InnoDB messages about recovery after creating redo logs
srv_log_files_created: A debug flag to ensure that InnoDB redo log
files can only be created once in the server lifetime, and that
after log files have been created, no crash recovery will take place.

recv_scan_log_recs(): Detect the special case where the log consists
of a sole MLOG_CHECKPOINT record, such as immediately after creating
the redo logs.

recv_recovery_from_checkpoint_start(): Skip the recovery message
if the redo log is logically empty.
2017-06-28 11:58:43 +03:00
Marko Mäkelä
3e1d0ff574 Fix a merge error in commit 8f643e2063
A merge error caused InnoDB bootstrap to fail when
innodb_undo_tablespaces was set to more than 2.
This was because of a bug that was introduced to
srv_undo_tablespaces_init() by the merge.

Furthermore, some adjustments for Oracle Bug#25551311 aka
Bug#23517560 changes were forgotten. We must minimize direct
references to srv_undo_tablespaces_open and use predicates
instead.

srv_undo_tablespaces_init(): Increment srv_undo_tablespaces_open
once, not twice, per loop iteration.

is_system_or_undo_tablespace(): Remove (unused function).

is_predefined_tablespace(): Invoke srv_is_undo_tablespace().
2017-06-27 21:23:12 +03:00
Marko Mäkelä
29624ea304 MDEV-13176 ALTER TABLE…CHANGE col col TIMESTAMP NOT NULL DEFAULT… fails
When it comes to DEFAULT values of columns, InnoDB is imposing both
unnecessary and insufficient conditions on whether ALGORITHM=INPLACE
should be allowed for ALTER TABLE.

When changing an existing column to NOT NULL, any NULL values in the
columns only get a special treatment if the column is changed to an
AUTO_INCREMENT column (which is not supported by ALGORITHM=INPLACE)
or the column type is TIMESTAMP. In all other cases, an error
must be reported for the failure to convert a NULL value to NOT NULL.

InnoDB was unnecessarily interested in whether the DEFAULT value
is not constant when altering other than TIMESTAMP columns. Also,
when changing a TIMESTAMP column to NOT NULL, InnoDB was performing
an insufficient check, and it was incorrectly allowing a constant
DEFAULT value while not being able to replace NULL values with that
constant value.

Furthermore, in ADD COLUMN, InnoDB is unnecessarily rejecting certain
nondeterministic DEFAULT expressions (depending on the session
parameters or the current time).
2017-06-27 07:39:42 +03:00
Sergey Vojtovich
6d0aed42c5 MDEV-12070 - Introduce thd_query_safe() from MySQL 5.7
Merged relevant part of MySQL revision:
565d20b44f
2017-06-26 16:11:22 +04:00
Marko Mäkelä
0d69d313a1 Prevent interleaved error log output on InnoDB startup
buf_flush_page_cleaner_coordinator(): Signal the thread creator
that the error log output regarding setpriority() has been issued.

innobase_start_or_create_for_mysql(): Wait for
buf_flush_page_cleaner_coordinator() to completely start up.

This prevents sporadic failures of tests that search the server
error log for InnoDB redo log recovery messages.
2017-06-23 22:54:42 +03:00
Marko Mäkelä
0288fa619f Do allow writes for innodb_force_recovery=2 or 3
While the primary purpose of innodb_force_recovery is to allow
data to be rescued from an InnoDB instance that would crash due
to some data corruption, the settings 1, 2, or 3 are relatively
safe to use and there is no need to prevent write transactions
in these modes.

The setting innodb_force_recovery=4 and above can cause database
corruption. For those modes, we already set the flag
high_level_read_only to disable modifications, except DROP TABLE.

MODIFICATIONS_NOT_ALLOWED_MSG_FORCE_RECOVERY: Remove. There is no
need to spam the error log for each refused DML operation. It suffices
to return an error to the client. There will be messages at startup
if innodb_read_only or innodb_force_recovery are preventing writes.
2017-06-23 09:54:31 +03:00