Commit graph

19259 commits

Author SHA1 Message Date
Vladislav Vaintroub
0d1e805aeb Fix rocksdb tests on Windows 2017-11-16 22:11:08 +00:00
Vladislav Vaintroub
689168be12 MDEV-13852 - redefine WinWriteableFile such that IsSyncThreadSafe()
is set to true, as it should be.

Copy and modify original io_win.h header file to a different location
(as we cannot patch anything in submodule). Make sure modified header is
used.
2017-11-16 18:57:18 +00:00
Jan Lindström
0c4d11e819 MDEV-13206: INSERT ON DUPLICATE KEY UPDATE foreign key fail
This is caused by the following change:

commit 95d29c99f01882ffcc2259f62b3163f9b0e80c75
Author: Marko Mäkelä <marko.makela@oracle.com>
Date:   Tue Nov 27 11:12:13 2012 +0200

    Bug#15920445 INNODB REPORTS ER_DUP_KEY BEFORE CREATE UNIQUE INDEX COMPLETED

    There is a phase during online secondary index creation where the index has
    been internally completed inside InnoDB, but does not 'officially' exist yet.
    We used to report ER_DUP_KEY in these situations, like this:

    ERROR 23000: Can't write; duplicate key in table 't1'

    What we should do is to let the 'offending' operation complete, but report an
    error to the
    ALTER TABLE t1 ADD UNIQUE KEY (c2):

    ERROR HY000: Index c2 is corrupted
    (This misleading error message should be fixed separately:
    Bug#15920713 CREATE UNIQUE INDEX REPORTS ER_INDEX_CORRUPT INSTEAD OF DUPLICATE)

    row_ins_sec_index_entry_low(): flag the index corrupted instead of
    reporting a duplicate, in case the index has not been published yet.

    rb:1614 approved by Jimmy Yang

The problem is that after we have found a duplicate key on the primary key,
we continue to acquire the necessary gap locks on secondary indexes to
block concurrent transactions from inserting the searched records.
However, the search on a unique index used in a foreign key constraint
could return DB_NO_REFERENCED_ROW if INSERT .. ON DUPLICATE KEY UPDATE
does not contain a value for the foreign key column. In this case
we should return the original DB_DUPLICATE_KEY error instead
of DB_NO_REFERENCED_ROW.

Consider the following example:

create table child(a int not null primary key,
b int not null,
c int,
unique key (b),
foreign key (b) references
parent (id)) engine=innodb;

insert into child values (1,1,2);

insert into child(a) values (1) on duplicate key update c = 3;

Now primary key value 1 naturally causes a duplicate key error that will be
stored in node->duplicate. If there was no duplicate key error, we should
return the actual no referenced row error. As no value is provided for
column b, which is used in both the unique key and the foreign key, the
server uses 0 as the search value. Naturally, this is not found, leading to
DB_NO_REFERENCED_ROW. But we should update the row with primary key value 1
anyway, as requested by the ON DUPLICATE KEY UPDATE clause.
2017-11-16 11:05:24 +02:00
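The error precedence described for MDEV-13206 above can be modelled in a few lines. This is only an illustrative sketch in Python, not the InnoDB code: the DbErr enum and resolve_insert_error() are hypothetical names, and the handling of any case beyond the two stated in the commit message is an assumption.

```python
from enum import Enum


class DbErr(Enum):
    # Hypothetical stand-ins for the InnoDB error codes named in the message.
    SUCCESS = "DB_SUCCESS"
    DUPLICATE_KEY = "DB_DUPLICATE_KEY"
    NO_REFERENCED_ROW = "DB_NO_REFERENCED_ROW"


def resolve_insert_error(primary_err, secondary_err):
    """Choose which error to report after the primary key check and the
    later secondary index (foreign key) check have both run."""
    # A duplicate on the primary key wins over a missing referenced row,
    # so that ON DUPLICATE KEY UPDATE can still update the existing row.
    if (primary_err is DbErr.DUPLICATE_KEY
            and secondary_err is DbErr.NO_REFERENCED_ROW):
        return DbErr.DUPLICATE_KEY
    # Without a duplicate key error, report the actual secondary error.
    if secondary_err is not DbErr.SUCCESS:
        return secondary_err
    return primary_err
```

With the example tables above, the primary key check yields DUPLICATE_KEY and the search for b=0 yields NO_REFERENCED_ROW, so the sketch reports DUPLICATE_KEY.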
Jun Su
2401d14e6b Support CRC32 SSE2 implementation under Windows 2017-11-16 09:59:18 +01:00
Marko Mäkelä
5e7435c4b0 Correct a merge error (remove a bogus #ifdef) 2017-11-15 08:53:31 +02:00
Vladislav Vaintroub
f48d56c459 MDEV-14192 Apply marko's patch 2017-11-13 14:58:18 +00:00
Marko Mäkelä
b2dd5232d4 InnoDB: Remove ut_vsnprintf() and the use of my_vsnprintf(); use vsnprintf() 2017-11-13 04:32:56 +02:00
Marko Mäkelä
c19ef508b8 InnoDB: Remove ut_snprintf() and the use of my_snprintf(); use snprintf() 2017-11-13 02:11:48 +02:00
Marko Mäkelä
17bd6ed29a Remove STATUS_VERBOSE (there is no visible output) 2017-11-12 17:27:36 +02:00
Marko Mäkelä
fa00fedaac MDEV-14100 Assertion `!is_user_rec || leaf || ...
rec_get_offsets_func(): Relax a bogus debug assertion.
It would fail when we are operating on a copied prefix
of a node pointer record.
2017-11-10 15:58:52 +02:00
Marko Mäkelä
9618c04e3f Follow-up fix of MDEV-13795/MDEV-14332
row_log_table_apply_op(): Remove references to dict_table_t::n_vcols.
Virtual column information is no longer being written to the log.

row_log_t: Remove the unused fields n_old_col, n_old_vcol.
2017-11-10 15:58:52 +02:00
Marko Mäkelä
5d142f9958 MDEV-13795/MDEV-14332 Corruption during online table-rebuilding ALTER when VIRTUAL columns exist
When MySQL 5.7 introduced indexed virtual columns, it introduced
several bugs into the online table-rebuilding ALTER, that is,
the row_log_table_apply() family of functions.

The online_log format that was introduced for online table-rebuilding
ALTER in MySQL 5.6 should be sufficient. Ideally, any indexed virtual
column values would be evaluated based on the log records in the temporary
file. There is no need to log virtual column values.

(For ADD INDEX, that is row_log_apply(), we always must log the values of
the keys, no matter if the columns are virtual.)

Because omitting the virtual column values removes any chance of
row_log_table_apply() working with indexed virtual columns, we
will for now refuse LOCK=NONE in table-rebuilding ALTER operations
when indexes on virtual columns exist. This restriction would be
lifted in MDEV-14341.

innobase_indexed_virtual_exist(): New predicate, to determine if
indexed virtual columns exist in a table definition.

ha_innobase::check_if_supported_inplace_alter(): Refuse online rebuild
if indexed virtual columns exist.

rec_get_converted_size_temp_v(), rec_convert_dtuple_to_temp_v(): Remove.

row_log_table_delete(), row_log_table_update(), row_log_table_insert():
Remove parameters for virtual columns.

trx_undo_read_v_rows(): Remove the col_map parameter.

row_log_table_apply(): Do not deal with virtual columns.
2017-11-09 23:39:12 +02:00
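The new predicate from MDEV-13795/MDEV-14332 above amounts to asking whether any index refers to a virtual column. A toy Python model of that check follows; the Column, Index and TableDef classes and both function names are hypothetical and only loosely mirror innobase_indexed_virtual_exist() and the LOCK=NONE refusal.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    is_virtual: bool = False


@dataclass
class Index:
    name: str
    columns: list  # names of the indexed columns


@dataclass
class TableDef:
    columns: dict  # column name -> Column
    indexes: list = field(default_factory=list)


def indexed_virtual_exist(table):
    """Return True if any index in the table covers a virtual column."""
    return any(table.columns[name].is_virtual
               for index in table.indexes
               for name in index.columns)


def lock_none_rebuild_supported(table):
    """A table-rebuilding ALTER with LOCK=NONE is refused for now
    when indexed virtual columns exist."""
    return not indexed_virtual_exist(table)
```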
Sergei Petrunia
e2376e8137 MDEV-14334: Update test results for rocksdb.bulk_load_rev_data 2017-11-09 19:52:55 +03:00
Monty
d40c23570f Clean up after failed alter table in add_index_inplace_crash 2017-11-09 14:58:26 +02:00
Marko Mäkelä
761cf49265 Merge 10.1 into 10.2 2017-11-09 14:45:39 +02:00
Marko Mäkelä
d2ffafe00f MDEV-14333 Mariabackup --apply-log-only crashes if incomplete transactions with update_undo logs are present
trx_undo_free_prepared(): Relax the assertion for
mariabackup --apply-log-only.
2017-11-09 14:37:03 +02:00
Marko Mäkelä
7c85a8d936 Merge 10.1 into 10.2 2017-11-08 13:12:11 +02:00
Marko Mäkelä
644ffdeb92 Fix integer type mismatch in WSREP debug output 2017-11-08 09:26:46 +02:00
Marko Mäkelä
843e4508c0 Merge 10.1 into 10.2 2017-11-07 23:02:39 +02:00
Marko Mäkelä
a4feb04ace MDEV-14310 ALTER TABLE…ADD INDEX may corrupt the InnoDB system tablespace
FlushObserver::flush(): Never discard unwritten changes.
We do not want to risk corrupting the system tablespace
or .ibd files that are not part of a table-rebuilding ALTER.
2017-11-07 23:00:51 +02:00
Marko Mäkelä
d04c4b3905 MDEV-14304 Unnecessary conditions in buf_page_get_gen()
Ever since MDEV-10813 cleaned up InnoDB use of atomic memory operations
and made buf_block_fix() an atomic operation, some conditions around
buf_block_fix() have been unnecessary.
2017-11-07 22:45:41 +02:00
Marko Mäkelä
f830314fd5 Remove dead code for non-debug builds 2017-11-06 22:35:03 +02:00
Vladislav Vaintroub
40bae98c3d MDEV-12108 Fix backup for Innodb tables with DATA DIRECTORY 2017-11-06 19:21:23 +00:00
Marko Mäkelä
5691109689 Merge 10.0 into 10.1 2017-11-06 18:10:23 +02:00
Marko Mäkelä
51b4366bfb MDEV-13328 ALTER TABLE…DISCARD TABLESPACE takes a lot of time
With a big buffer pool that contains many data pages,
DISCARD TABLESPACE took a long time, because it would scan the
entire buffer pool to remove any pages that belong to the tablespace.
The cost of that scan depends on the size of the buffer pool, not on
the size of the table, so it is paid even when the table-to-discard
is empty.

The minimum amount of work that DISCARD TABLESPACE must do is to
remove the pages of the to-be-discarded table from the
buf_pool->flush_list because any writes to the data file must be
prevented before the file is deleted.

If DISCARD TABLESPACE does not evict the pages from the buffer pool,
then IMPORT TABLESPACE must do it, because we must prevent pre-DISCARD,
not-yet-evicted pages from being mistaken for pages of the imported
tablespace.

It would not be a useful fix to simply move the buffer pool scan to
the IMPORT TABLESPACE step. What we can do is to actively evict those
pages that could be mistaken for imported pages. In this way, when
importing a small table into a big buffer pool, the import should
still run relatively fast.

Import is bypassing the buffer pool when reading pages for the
adjustment phase. In the adjustment phase, if a page exists in
the buffer pool, we could replace it with the page from the imported
file. Unfortunately I did not get this to work properly, so instead
we will simply evict any matching page from the buffer pool.

buf_page_get_gen(): Implement BUF_EVICT_IF_IN_POOL, a new mode
where the requested page will be evicted if it is found. There
must be no unwritten changes for the page.

buf_remove_t: Remove. Instead, use trx!=NULL to signify that a write
to file is desired, and use a separate parameter bool drop_ahi.

buf_LRU_flush_or_remove_pages(), fil_delete_tablespace():
Replace buf_remove_t.

buf_LRU_remove_pages(), buf_LRU_remove_all_pages(): Remove.

PageConverter::m_mtr: A dummy mini-transaction buffer

PageConverter::PageConverter(): Complete the member initialization list.

PageConverter::operator()(): Evict any 'shadow' pages from the
buffer pool so that pre-existing (garbage) pages cannot be mistaken
for pages that exist in the being-imported file.

row_discard_tablespace(): Remove a bogus comment that seems to
refer to IMPORT TABLESPACE, not DISCARD TABLESPACE.
2017-11-06 18:08:33 +02:00
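The division of work described for MDEV-13328 above can be illustrated with a toy model. This is a sketch under simplifying assumptions, not the InnoDB buffer pool: ToyBufferPool, its methods and import_tablespace() are hypothetical names, and pages are identified by a plain (space_id, page_no) tuple.

```python
class ToyBufferPool:
    """Toy buffer pool: a page table plus a list of dirty pages."""

    def __init__(self):
        self.pages = {}       # (space_id, page_no) -> page contents
        self.flush_list = []  # ids of pages with unwritten changes

    def add(self, space_id, page_no, data, dirty=False):
        page_id = (space_id, page_no)
        self.pages[page_id] = data
        if dirty and page_id not in self.flush_list:
            self.flush_list.append(page_id)

    def discard_tablespace(self, space_id):
        # Minimum work for DISCARD: walk only the flush list, so that no
        # write to the soon-to-be-deleted file can happen. Clean pages of
        # the tablespace are left in the pool.
        self.flush_list = [p for p in self.flush_list if p[0] != space_id]

    def evict_if_in_pool(self, space_id, page_no):
        # Analogue of the BUF_EVICT_IF_IN_POOL mode: drop the page if it
        # is cached. There must be no unwritten changes for the page.
        page_id = (space_id, page_no)
        if page_id in self.flush_list:
            raise RuntimeError("page has unwritten changes")
        if page_id in self.pages:
            del self.pages[page_id]
            return True
        return False


def import_tablespace(pool, space_id, n_pages):
    """Evict stale 'shadow' pages, one lookup per page of the imported
    file, instead of scanning the whole buffer pool."""
    return sum(pool.evict_if_in_pool(space_id, page_no)
               for page_no in range(n_pages))
```

In this model the cost of DISCARD depends on the flush list, and the cost of IMPORT depends on the size of the imported file, so importing a small table into a big buffer pool stays cheap.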
Marko Mäkelä
57ba66b9ab Remove redundant function parameters
buf_flush_or_remove_pages(), buf_flush_dirty_pages(): Remove the
redundant parameter flush=(trx!=NULL).
2017-11-06 18:08:33 +02:00
Marko Mäkelä
6a524fcfdd MDEV-14140 IMPORT TABLESPACE must not go beyond FSP_FREE_LIMIT
ibuf_check_bitmap_on_import(): Only access the pages that
are below FSP_FREE_LIMIT. It is possible that especially with
ROW_FORMAT=COMPRESSED, the FSP_SIZE will be much bigger than
the FSP_FREE_LIMIT, and the bitmap pages (page_size*N, 1+page_size*N)
are filled with zero bytes.

buf_page_is_corrupted(), buf_page_io_complete(): Make the
fault injection compatible with MariaDB 10.2.

Backport the IMPORT tests from 10.2.
2017-11-06 14:55:34 +02:00
Marko Mäkelä
51679e5c38 MDEV-14132 InnoDB page corruption
On some old GNU/Linux systems, invoking posix_fallocate() with
offset=0 would sometimes cause already allocated bytes in the
data file to be overwritten.

Fix a correctness regression that was introduced in
commit 420798a81a
by invoking posix_fallocate() in a safer way.
A similar change was made in MDEV-5746 earlier.

os_file_get_size(): Avoid changing the state of the file handle,
by invoking fstat() instead of lseek().

os_file_set_size(): Determine the current size of the file
by os_file_get_size(), and then extend the file from that point
onwards.
2017-11-06 08:53:51 +02:00
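The difference between the two ways of determining a file size in MDEV-14132 above can be shown with Python's os module, which wraps the same fstat() and lseek() calls. The helper names below are hypothetical; this is a small sketch, not the os_file_get_size() implementation.

```python
import os
import tempfile


def size_by_fstat(fd):
    """Read the size from the file metadata; the file offset is untouched."""
    return os.fstat(fd).st_size


def size_by_lseek(fd):
    """Read the size by seeking to the end; this moves the file offset."""
    return os.lseek(fd, 0, os.SEEK_END)


def current_offset(fd):
    """Report the current file offset without changing it."""
    return os.lseek(fd, 0, os.SEEK_CUR)
```

After a size_by_lseek() call, the offset stays at the end of the file unless it is restored with further lseek() calls, which is the state change on the file handle that the fstat() variant avoids.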
Marko Mäkelä
30a8764b92 MDEV-14244 MariaDB fails to run with O_DIRECT
os_file_set_size(): If posix_fallocate() returns EINVAL, fall back
to writing zero bytes to the file. Also, remove some error log output,
and make it possible for a server shutdown to interrupt the fall-back
code.

MariaDB used to ignore any possible return value from posix_fallocate()
ever since innodb_use_fallocate was introduced in MDEV-4338. If EINVAL
was returned, the file would not be extended.

Starting with MDEV-11520, MariaDB would treat EINVAL as a hard error.

Why is the EINVAL returned? The GNU posix_fallocate() function
would first try the fallocate() system call, which would return
-EOPNOTSUPP for many file systems (notably, not ext4). Then, it
would fall back to extending the file one block at a time by invoking
pwrite(fd, "", 1, offset) where offset is 1 less than a multiple of
the file block size. This would fail with EINVAL if the file is in
O_DIRECT mode, because O_DIRECT requires aligned operation.
2017-11-06 08:53:50 +02:00
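The fall-back described for MDEV-14244 above can be sketched as follows. This is a minimal Python illustration, not the os_file_set_size() code: set_file_size() and its allocate parameter are hypothetical, the allocator is passed in so that os.posix_fallocate can be supplied where it is available, and the sketch ignores both the buffer alignment that real O_DIRECT writes would need and the shutdown interruption mentioned in the message.

```python
import errno
import os


def set_file_size(fd, new_size, allocate, chunk=1 << 16):
    """Extend the file to new_size bytes, starting from its current size.

    allocate(fd, offset, length) is tried first; if it fails with EINVAL,
    the new range is filled by writing zero bytes instead.
    """
    current = os.fstat(fd).st_size
    if new_size <= current:
        return "unchanged"
    try:
        # Extend from the current end of the file, not from offset 0.
        allocate(fd, current, new_size - current)
        return "allocated"
    except OSError as e:
        if e.errno != errno.EINVAL:
            raise
    # Fall back to writing zero bytes to the file.
    zeros = bytes(chunk)
    offset = current
    while offset < new_size:
        n = min(chunk, new_size - offset)
        offset += os.pwrite(fd, zeros[:n], offset)
    return "zero-filled"
```

On a platform that provides it, the call would look like set_file_size(fd, size, os.posix_fallocate). The EINVAL in the commit message comes from the one-byte pwrite() at an offset that is 1 less than a multiple of the block size, which cannot be block-aligned.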
Sergei Petrunia
8f2e8cf0cb Merge branch '10.2' of github.com:MariaDB/server into bb-10.2-mariarocks 2017-11-04 12:38:17 +03:00
Vladislav Vaintroub
b0cfb16867 Fix a warning. 2017-11-02 17:48:50 +00:00
Sergei Petrunia
80d61515ac Make rocksdb.read_only_tx pass and enable it
- FB/MySQL 5.6's MyRocks has START TRANSACTION WITH CONSISTENT
  ROCKSDB SNAPSHOT, which returns binlog position.

- MariaDB has a cross-engine START TRANSACTION WITH CONSISTENT
  SNAPSHOT. It can be used for the same purpose. Binlog position
  can be obtained from Binlog_snapshot_file/position status vars.
2017-11-02 16:15:37 +00:00
Marko Mäkelä
19733efa7b MDEV-14244 MariaDB 10.2.10 fails to run on Debian Stretch with ext3 and O_DIRECT
os_file_set_size(): If posix_fallocate() returns EINVAL, fall back
to writing zero bytes to the file. Also, remove some error log output,
and make it possible for a server shutdown to interrupt the fall-back
code.

MariaDB 10.2 used to handle the EINVAL return value from posix_fallocate()
before commit b731a5bcf2
which refactored os_file_set_size() to try posix_fallocate().

Why is the EINVAL returned? The GNU posix_fallocate() function
would first try the fallocate() system call, which would return
-EOPNOTSUPP for many file systems (notably, not ext4). Then, it
would fall back to extending the file one block at a time by invoking
pwrite(fd, "", 1, offset) where offset is 1 less than a multiple of
the file block size. This would fail with EINVAL if the file is in
O_DIRECT mode, because O_DIRECT requires aligned operation.
2017-11-02 16:18:41 +02:00
Monty
0f4e005541 Fixed compiler warning and warning from valgrind
The failing test was main.gis-json
2017-11-02 15:40:27 +02:00
Sergei Golubchik
c4c48e9740 MDEV-11965 -Werror should not appear in released tarballs 2017-11-02 06:32:20 +00:00
Marko Mäkelä
6692b5f74a Merge 10.1 into 10.2 2017-11-01 09:55:00 +02:00
Marko Mäkelä
892cf2de13 Merge 10.0 into 10.1 2017-10-31 09:11:31 +02:00
Marko Mäkelä
88edb1b3ed MDEV-14219 Allow online table rebuild when encryption or compression parameters change
When MariaDB 10.1.0 introduced table options for encryption and
compression, it unnecessarily changed
ha_innobase::check_if_supported_inplace_alter() so that ALGORITHM=COPY
is forced when these parameters differ.

A better solution is to move the check to innobase_need_rebuild().
In that way, the ALGORITHM=INPLACE interface (yes, the syntax is
very misleading) can be used for rebuilding the table much more
efficiently, with merge sort, with no undo logging, and allowing
concurrent DML operations.
2017-10-31 09:10:25 +02:00
Marko Mäkelä
d11001d11b Backport MDEV-13890 from 10.2 (InnoDB/XtraDB shutdown failure)
If InnoDB or XtraDB recovered committed transactions at server
startup, but the processing of recovered transactions was
prevented by innodb_read_only or by innodb_force_recovery,
an assertion would fail at shutdown.

This bug was originally reproduced when Mariabackup executed
InnoDB shutdown after preparing (applying redo log into) a backup.

trx_free_prepared(): Allow TRX_STATE_COMMITTED_IN_MEMORY.

trx_undo_free_prepared(): Allow any undo log state. For transactions
that were resurrected in TRX_STATE_COMMITTED_IN_MEMORY
the undo log state would have been reset by trx_undo_set_state_at_finish().
2017-10-30 18:43:16 +02:00
Marko Mäkelä
58e0dcb93d Add a missing space to an error message 2017-10-30 10:06:47 +02:00
Elena Stepanova
a269173e97 Workaround for MDEV-13852 (tests don't run on Windows) 2017-10-30 03:24:35 +02:00
Sergei Petrunia
7cca0df0d7 Fix rocksdb.rocksdb test
Forgot to put the updated rocksdb.result in
2017-10-29 22:55:51 +03:00
Sergei Petrunia
e5678c3fac MDEV-13904: rocksdb.add_index_inplace_sstfilewriter timed out
Downscale rocksdb.add_index_inplace_sstfilewriter to be 10x smaller
2017-10-29 13:21:23 +03:00
Sergei Petrunia
34188ac455 Organize information in storage/rocksdb/mysql-test/rocksdb/t/disabled.def 2017-10-29 09:41:40 +00:00
Sergei Petrunia
a6dc22fc73 MyRocks: enable a few tests that do not seem to fail anymore 2017-10-29 11:49:18 +03:00
Sergei Petrunia
b6d4859547 MDEV-14181: rocksdb.rocksdb fails: line 1117: query 'reap' succeeded - should have failed
Fix a race condition in the testcase.
2017-10-29 11:39:52 +03:00
Sergei Petrunia
8abe840085 Merge branch 'bb-10.2-mariarocks' of github.com:MariaDB/server into 10.2 2017-10-29 11:27:24 +03:00
Vladislav Vaintroub
97df230aed MDEV-14115 : Do not use lpNumberOfBytesRead/Written params in
ReadFile/WriteFile operations.

InnoDB opens files with FILE_FLAG_OVERLAPPED. lpNumberOfBytesRead/Written
are documented to be potentially inaccurate in this case
(possibly even if async operations complete synchronously?).

The fix is to always pass NULL for the corresponding parameters,
as recommended by MSDN. Read the actual counts with
GetQueuedCompletionStatus() or GetOverlappedResult().
2017-10-27 23:42:02 +00:00
Marko Mäkelä
067f83969c MDEV-14132 follow-up fix: Make os_file_get_size() thread-safe
os_file_get_size(): Use fstat() instead of calling lseek() 3 times.
In this way, concurrent calls to this function should not interfere
with each other.

Suggested by Vladislav Vaintroub.
2017-10-27 19:33:38 +03:00
Marko Mäkelä
9dfe84d5de Remove a bogus page_is_root() debug assertion on btr_create() failure
The predicate page_is_root() would not hold if btr_create() fails
before the root page is fully initialized. Move the debug assertion
from btr_free_root_invalidate() to its other caller,
btr_free_if_exists(). In that caller, we actually already checked
for page_is_root().
2017-10-27 19:01:24 +03:00