Commit graph

183813 commits

Author SHA1 Message Date
Marko Mäkelä
7c5519c12d MDEV-22387: Do not violate __attribute__((nonnull))
Passing a null pointer to a nonnull argument is not only undefined
behaviour, but it also grants the compiler the permission to optimize
away further checks whether the pointer is null. GCC -O2 at least
starting with version 8 may do that, potentially causing SIGSEGV.
2020-09-23 12:47:49 +03:00
Marko Mäkelä
70960bd33d UBSAN: Fix a bit shift overflow
Shifting a 16-bit type by 16 bits is undefined behaviour.
The result is at least 32 bits, so let us cast the shift operand
to a wider type before shifting.
2020-09-23 12:42:30 +03:00
Marko Mäkelä
af40a2b43e Fix GCC 10.2.0 -Og -fsanitize=undefined -Wmaybe-uninitialized
For some reason, adding -fsanitize=undefined (cmake -DWITH_UBSAN=ON)
to the compilation flags will cause even more warnings to be emitted.
The warnings do look bogus, but the code can be simplified.
2020-09-23 12:27:56 +03:00
Marko Mäkelä
0448558a0d Fix GCC 10.2.0 -Og -fsanitize=undefined -Wformat-overflow
For some reason, adding -fsanitize=undefined (cmake -DWITH_UBSAN=ON)
to the compilation flags will cause even more warnings to be emitted.
The warning was a bogus one:

tests/mysql_client_test.c:8632:22: error: '%d' directive writing between
1 and 11 bytes into a region of size 9 [-Werror=format-overflow=]
 8632 |     sprintf(field, "c%d int", i);
      |                      ^~
tests/mysql_client_test.c:8632:20: note: directive argument
in the range [-2147483648, 999]

The warning does not take into account that the lower bound of the
variable actually is 0. But, we can help the compiler and use an
unsigned variable.
2020-09-23 12:14:05 +03:00
Oleksandr Byelkin
594c11fffc Ukrainian error text translations added. 2020-09-22 15:13:17 +02:00
Marko Mäkelä
9d0ee2dcb7 Merge 10.1 into 10.2 2020-09-22 15:21:43 +03:00
Marko Mäkelä
78efa10930 MDEV-22939: Restore an AUTO_INCREMENT check
It turns out that we must check for DISCARD TABLESPACE both
when the table is being rebuilt and when the AUTO_INCREMENT
value of the table is being added.

This was caught by the test innodb.alter_missing_tablespace.
Somehow I failed to run all tests. Sorry!
2020-09-22 13:55:36 +03:00
Marko Mäkelä
3eb81136e1 MDEV-22939 Server crashes in row_make_new_pathname()
The statement ALTER TABLE...DISCARD TABLESPACE is problematic,
because its designed purpose is to break the referential integrity
of the data dictionary and make a table point to nowhere.

ha_innobase::commit_inplace_alter_table(): Check whether the
table has been discarded. (This is a bit late to check it, right
before committing the change.) Previously, we performed this check
only in a specific branch of the function commit_set_autoinc().

Note: We intentionally allow non-rebuilding ALTER TABLE even if
the tablespace has been discarded, to remain compatible with MySQL.
(See the various tests with "wl5522" in the name, such as
innodb.innodb-wl5522.)

The test case would crash starting with 10.3 only, but it does not hurt
to minimize the code and test difference between 10.2 and 10.3.
2020-09-22 13:40:05 +03:00
Marko Mäkelä
e5e83daf32 Make DISCARD TABLESPACE more robust
dict_load_table_low(): Copy the 'discarded' flag to file_unreadable.
This allows to avoid a potentially harmful call to dict_stats_init()
in ha_innobase::open().
2020-09-22 13:09:51 +03:00
Marko Mäkelä
2af8f712de MDEV-23776: Re-apply the fix and make the test more robust
The test that was added in commit e05650e686
would break a subsequent run of a test encryption.innodb-bad-key-change
because some pages in the system tablespace would be encrypted with
a different key.

The failure was repeatable with the following invocation:

./mtr --no-reorder \
encryption.create_or_replace,cbc \
encryption.innodb-bad-key-change,cbc

Because the crash was unrelated to the code changes that we reverted
in commit eb38b1f703
we can safely re-apply those fixes.
2020-09-22 13:08:09 +03:00
Marko Mäkelä
732cd7fd53 MDEV-23705 Assertion 'table->data_dir_path || !space'
After DISCARD TABLESPACE, the tablespace of a table will no longer
exist, and dict_get_and_save_data_dir_path() would invoke
dict_get_first_path() to read an entry from SYS_DATAFILES.
For some reason, DISCARD TABLESPACE would not to remove the entry
from there.

dict_get_and_save_data_dir_path(): If the tablespace has been
discarded, do not bother trying to read the name.

Side note: The tables SYS_TABLESPACES and SYS_DATAFILES are
redundant and subject to removal in MDEV-22343.
2020-09-22 11:13:51 +03:00
Marko Mäkelä
eb38b1f703 Revert "MDEV-23776 Test encryption.create_or_replace fails with a warning"
This reverts commit e33f7b6faa.
The change seems to have introduced intermittent failures of the test
encryption.innodb-bad-key-change on many platforms.

The failure that we were trying to address was not reproduced on 10.2.
It could be related to commit a7dd7c8993
(MDEV-23651) or de942c9f61 (MDEV-15983)
or other changes that reduced contention on fil_system.mutex in 10.3.

The fix that we are hereby reverting from 10.2 seems to work fine
on 10.3 and 10.4.
2020-09-22 10:41:06 +03:00
Daniel Black
4c19227929 systemd: mariadb@bootstrap - clear ExecStartPre and ExecStartPost
This is just to make sure no ExecStartPre/Post actions from the
multi-instance MariaDB service definition are executed
when a user attempts to start mariadb@bootstrap.

Fixes: 3723c70a30
2020-09-22 15:37:44 +10:00
Marko Mäkelä
e05650e686 MDEV-23776: Add the reduced encryption.create_or_replace test
This was forgotten in commit a9d8f5c1a5.
2020-09-21 16:35:28 +03:00
Marko Mäkelä
407d170c92 MDEV-23711 fixup: GCC -Og -Wmaybe-uninitialized 2020-09-21 16:29:29 +03:00
Marko Mäkelä
a9d8f5c1a5 MDEV-23776: Split encryption.create_or_replace
Let us shrink the test encryption.create_or_replace so that it can
run on the CI system, also on the embedded server.

encryption.create_or_replace_big: Renamed from the original test,
with the subset of encryption.create_or_replace omitted.
2020-09-21 16:14:35 +03:00
Marko Mäkelä
e33f7b6faa MDEV-23776 Test encryption.create_or_replace fails with a warning
The test encryption.create_or_replace would occasionally fail with
a warning message from fil_check_pending_ops().

fil_crypt_find_space_to_rotate(): While waiting for available
I/O capacity, check fil_space_t::is_stopping() and release a
handle if necessary.

fil_space_crypt_close_tablespace(): Wake up the waiters in
fil_crypt_find_space_to_rotate().
2020-09-21 15:55:27 +03:00
Jan Lindström
a3bdce8f1e MDEV-23659 : Update Galera disabled.def file 2020-09-21 14:01:56 +03:00
Jan Lindström
a0e2a293bc Fix try. 2020-09-21 13:59:13 +03:00
Vlad Lesin
6c4c88dbb8 MDEV-18867: Remove an orphan function 2020-09-21 12:52:44 +03:00
Vlad Lesin
0a224edc3e MDEV-23711 make mariabackup innodb redo log read error message more clear
log_group_read_log_seg() returns error when:

1) Calculated log block number does not correspond to read log block
number. This can be caused by:
  a) Garbage or an incompletely written log block. We can exclude this
  case by checking log block checksum if it's enabled(see innodb-log-checksums,
  encrypted log block contains checksum always).
  b) The log block is overwritten. In this case checksum will be correct and
  read log block number will be greater then requested one.

2) When log block length is wrong. In this case recv_sys->found_corrupt_log
is set.

3) When redo log block checksum is wrong. In this case innodb code
writes messages to error log with the following prefix: "Invalid log
block checksum."

The fix processes all the cases above.
2020-09-21 12:29:52 +03:00
Jan Lindström
69d536a22d MDEV-23751 : galera_3nodes test failures on ipv6 sst tests
Fix assertion text it was too tight for some systems.

This is backport from 10.4 and for Galera 3.
2020-09-18 14:09:24 +03:00
Jan Lindström
f381e019b6 MDEV-23574 : galera_3nodes.galera_ipv6_mariabackup_section MTR failed: Could not open '../galera/include/have_mariabackup.inc'
Test case and configuration cleanup.
2020-09-17 12:55:06 +03:00
Jan Lindström
e3e657373a MDEV-21769 : galera_3nodes.galera_safe_to_bootstrap fails
Add wait_condition to wait correct cluster configuration.
2020-09-17 08:25:07 +03:00
Jan Lindström
96426dac91 MDEV-21655 : galera.galera_wan_restart_ist MTR fails sporadically: WSREP did not transition to state READY
Replace sleeps with proper wait_conditions to wait correct
cluster configuration.
2020-09-16 15:23:41 +03:00
Sujatha
873cc1e77a MDEV-21839: Handle crazy offset to SHOW BINLOG EVENTS
Problem:
=======
SHOW BINLOG EVENTS FROM <"random"-pos> caused a variety of failures as
reported in MDEV-18046. They are fixed but that approach is not future-proof
as well as is not optimal to create extra check for being constructed event
parameters.

Analysis:
=========
"show binlog events from <pos>" code considers the user given position as a
valid event start position. The code starts reading data from this event start
position onwards and tries to map it to a set of known events. Each event has
a specific event structure and asserts have been added to ensure that, read
event data, satisfies the event specific requirements. When a random position
is supplied to "show binlog events command" the event structure specific
checks will fail and they result in assert.

For example: https://jira.mariadb.org/browse/MDEV-18046
In the bug description user executes CREATE TABLE/INSERT and ALTER SQL
commands.

When a crazy offset like "SHOW BINLOG EVENTS FROM 365" is provided code
assumes offset 365 as valid event begin and proceeds to EVENT_LEN_OFFSET reads
some random length and comes up with a crazy event which didn't exits in the
binary log. In this quoted example scenario, event read at offset 365 is
considered as "Update_rows_log_event", which is not present in binary log.
Since this is a random event its validation fails and code results in
assert/segmentation fault, as shown below.

mysqld: /data/src/10.4/sql/log_event.cc:10863: Rows_log_event::Rows_log_event(
    const char*, uint, const Format_description_log_event*):
    Assertion `var_header_len >= 2' failed.
    181220 15:27:02 [ERROR] mysqld got signal 6 ;
#7  0x00007fa0d96abee2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#8  0x000055e744ef82de in Rows_log_event::Rows_log_event (this=0x7fa05800d390,
    buf=0x7fa05800d080 "", event_len=254, description_event=0x7fa058006d60) at
/data/src/10.4/sql/log_event.cc:10863
#9  0x000055e744f00cf8 in Update_rows_log_event::Update_rows_log_event

Since we are reading random data repeating the same command SHOW BINLOG EVENTS
FROM 365 produces different types of crashes with different events. MDEV-18046
reported 10 such crashes.

In order to avoid such scenarios user provided starting offset needs to be
validated for its correctness. Best way of doing this is to make use of
checksums if they are available. MDEV-18046 fix introduced the checksum based
validation.

The issue still remains in cases where binlog checksums are disabled. Please
find the following bug reports.

MDEV-22473: binlog.binlog_show_binlog_event_random_pos failed in buildbot,
            server crashed in read_log_event
MDEV-22455: Server crashes in Table_map_log_event,
            binlog.binlog_invalid_read_in_rotate failed in buildbot

Fix:
====
When binlog checksum is disabled, perform scan(via reading event by event), to
validate the requested FROM <pos> offset. Starting from offset 4 read the
event_length of next_event in the binary log. Using the next_event length
advance current offset to point to next event. Repeat this process till the
current offset is less than or equal to crazy offset. If current offset is
higher than crazy offset provide appropriate invalid input offset error.
2020-09-16 14:03:32 +05:30
Daniel Black
5768f57d24 mtr: main.mysql_upgrade - record after correcting error message 2020-09-14 21:19:56 +10:00
Vlad Lesin
80075ba011 MDEV-19264 Better support MariaDB GTID for Mariabackup's --slave-info option
Parse SHOW SLAVE STATUS output for the "Using_Gtid" column. If the value
is "No", then old log file and position is backed up, otherwise gtid_slave_pos
is backed up.
2020-09-14 11:14:50 +03:00
Nikita Malyavin
ae8ff3a067 MDEV-20396 Server crashes after DELETE with SEL NULL Foreign key and a
virtual column in index

Problem:
row_ins_foreign_fill_virtual was unconditionally set virtual fields to NULL
even though the field is not a part of a foreign key
(but a part of an index)

Solution:
The new virtual value should be computed with regard to cascade updates.
2020-09-14 18:08:56 +10:00
Daniel Black
269f9c948c mysql_upgrade: fix error text 2020-09-12 09:37:40 +10:00
Sachin
c3a3b4596c MDEV-17438 rpl.show_status_stop_slave_race-7126 fails with timeout on Windows
Problem:- Test case uses socket which does not work on windows, So mysqlslap
falls back to default connection which is defined in my.cnf and connects
to master instead of slave. Since start slave/stop slave is executed on
master we get error MASTER_HOST was not set

Solution:- Use Port instead of socket
2020-09-11 15:35:32 +01:00
Jan Lindström
bc2dbdb601 MDEV-23587 : galera_3nodes.galera_var_dirty_reads2 MTR failed: 1047: WSREP has not yet prepared node for application use
Add wait_condition tomake sure insert is replicated and server
is after isolation back on ready state.
2020-09-11 14:49:42 +03:00
Thirunarayanan Balathandayuthapani
bfd1ed5a13 MDEV-18867 Long Time to Stop and Start
- Addressing ASAN failure in fts_check_aux_table()
2020-09-11 12:24:29 +05:30
Jan Lindström
224c950462 MDEV-23101 : SIGSEGV in lock_rec_unlock() when Galera is enabled
Remove incorrect BF (brute force) handling from lock_rec_has_to_wait_in_queue
and move condition to correct callers. Add a function to report
BF lock waits and assert if incorrect BF-BF lock wait happens.

wsrep_report_bf_lock_wait
	Add a new function to report BF lock wait.

wsrep_assert_no_bf_bf_wait
	Add a new function to check do we have a
	BF-BF wait and if we have report this case
	and assert as it is a bug.

lock_rec_has_to_wait
	Use new wsrep_assert_bf_wait to check BF-BF wait.

lock_rec_create_low
lock_table_create
	Use new function to report BF lock waits.

lock_rec_insert_by_trx_age
lock_grant_and_move_on_page
lock_grant_and_move_on_rec
	Assert that trx is not Galera as VATS is not compatible
	with Galera.

lock_rec_add_to_queue
	If there is conflicting lock in a queue make sure that
	transaction is BF.

lock_rec_has_to_wait_in_queue
	Remove incorrect BF handling. If there is conflicting
	locks in a queue all transactions must wait.

lock_rec_dequeue_from_page
lock_rec_unlock
	If there is conflicting lock make sure it is not
	BF-BF case.

lock_rec_queue_validate
	Add Galera record locking rules comment and use
	new function to report BF lock waits.

All attempts to reproduce the original assertion have been
failed. Therefore, there is no test case on this commit.
2020-09-10 13:18:12 +03:00
Thirunarayanan Balathandayuthapani
75e82f71f1 MDEV-18867 Long Time to Stop and Start
fts_drop_orphaned_tables() takes long time to remove the orphaned
FTS tables. In order to reduce the time, do the following:

- Traverse fil_system.space_list and construct a set of
table_id,index_id of all FTS_*.ibd tablespaces.
- Traverse the sys_indexes table and ignore the entry
from the above collection if it exist.
- Existing elements in the collection can be considered as
orphaned fts tables. construct the table name from
(table_id,index_id) and invoke fts_drop_tables().
- Removed DICT_TF2_FTS_AUX_HEX_NAME flag usage from upgrade.
- is_aux_table() in dict_table_t to check whether the given name
is fts auxiliary table
fts_space_set_t is a structure to store set of parent table id
and index id
- Remove unused FTS function in fts0fts.cc
- Remove the fulltext index in row_format_redundant test case.
Because it deals with the condition that SYS_TABLES does have
corrupted entry and valid entry exist in SYS_INDEXES.
2020-09-10 14:10:26 +05:30
Jan Lindström
5c07ce406b MDEV-23706 : Galera test failure on galera_autoinc_sst_mariabackup
Remove infinite procedure and use direct INSERTs.
2020-09-09 16:03:15 +03:00
Marko Mäkelä
0eb38243ce MDEV-23456 fixup: Simplify a comparison 2020-09-09 13:04:11 +03:00
Marko Mäkelä
040ae4c59b MDEV-22924 fixup: Replace C++11 auto 2020-09-09 13:02:25 +03:00
Marko Mäkelä
d44c0f46c5 MDEV-22924 fixup: Replace C++11 nullptr
Only starting with MariaDB Server 10.4 we may depend on C++11.
2020-09-09 12:26:51 +03:00
Marko Mäkelä
64c8fa58a8 MDEV-23685 SIGSEGV on ADD FOREIGN KEY after failed ADD KEY
dict_foreign_qualify_index(): Reject corrupted or garbage indexes.
For index stubs that are created on virtual columns, no
dict_field_t::col would be assign. Instead, the entire table
definition would be reloaded on a successful operation.
2020-09-09 12:04:42 +03:00
Marko Mäkelä
c26eae0cc0 MDEV-23456 fixup: Fix mtr_t::get_fix_count()
Before commit 05fa4558e0 (MDEV-22110)
we have slot->type == MTR_MEMO_MODIFY that are unrelated to
incrementing the buffer-fix count.

FindBlock::operator(): In debug builds, skip MTR_MEMO_MODIFY entries.

Also, simplify the code a little.

This fixes an infinite loop in the tests
innodb.innodb_defragment and innodb.innodb_wl6326_big.
2020-09-09 12:01:03 +03:00
Thirunarayanan Balathandayuthapani
b1009ae5c1 MDEV-23456 fil_space_crypt_t::write_page0() is accessing an uninitialized page
buf_page_create() is invoked when page is initialized. So that
previous contents of the page ignored. In few cases, it calls
buf_page_get_gen() is called to fetch the page from buffer pool.
It should take x-latch on the page. If other thread uses the block
or block io state is different from BUF_IO_NONE then release the
mutex and check the state and buffer fix count again. For compressed
page, use the existing free block from LRU list to create new page.
Retry to fetch the compressed page if it is in flush list

fseg_create(), fseg_create_general(): Introduce block as a parameter
where segment header is placed. It is used to avoid repetitive
x-latch on the same page

Change the assert to check whether the page has SX latch and
X latch in all callee function of buf_page_create()

mtr_t::get_fix_count(): Get the buffer fix count of the given
block added by the mtr

FindBlock is added to find the buffer fix count of the given
block acquired by the mini-transaction
2020-09-09 11:58:15 +05:30
Marko Mäkelä
f99cace77f MDEV-22924 Corruption in MVCC read via secondary index
An unsafe optimization was introduced by
commit 2347ffd843 (MDEV-20301)
which is based on
mysql/mysql-server@3f3136188f or
mysql/mysql-server@647a3814a9
in MySQL 8.0.12 or MySQL 8.0.13
(which in turn is based on the contribution in MySQL Bug #84958).

Row_sel_get_clust_rec_for_mysql::operator(): In addition to checking
that the pointer to the record matches, also check the latest
modification of the page (FIL_PAGE_LSN) as well as the page identifier.
Only if all three match, it is safe to reuse cached_old_vers.

Row_sel_get_clust_rec_for_mysql::check_eq(): Assert that the PRIMARY KEY
of the cached old version of the record corresponds to the latest version.

We got a test case where CHECK TABLE, UPDATE and purge would be
hammering on the same table (with only 6 rows) and a pointer that
was originally pointing to a record pk=2 would match a cached_clust_rec
that was pointing to a record pk=1. In the diagnosed `rr replay` trace,
we would wrongly return an old cached version of the pk=1 record,
instead of retrieving the correct version of the pk=2 record. Because
of this, CHECK TABLE would fail to count one of the records in a
secondary index, and report failure.

This bug appears to affect MVCC reads via secondary indexes only.
The purge of history in secondary indexes uses a different code path,
and so do checks for implicit record locks.
2020-09-07 15:31:54 +03:00
Sujatha
a8f6bbb7a8 MDEV-9501: rpl.rpl_binlog_index, rpl.rpl_gtid_crash, rpl.rpl_stm_multi_query fail sporadically in buildbot with Master command COM_REGISTER_SLAVE failed
Analysis:
========
Slave server will send COM_REGISTER_SLAVE command at the time of establishing
a connection to master. If master is down, then the command will fail and
COM_REGISTER_SLAVE failed warning is reported.

'rpl_binlog_index.test' shutsdown the master and it relocates binary logs to a
new location and attempts to start master by pointing 'log-bin' to new
location. During this process the slave threads are active. IO thread actively
checks for the presence of master when it finds that the connection is lost it
attempts a reconnect, as master is down COM_REGISTER_SLAVE command fails.

As part of fix, stop the slave threads and then shutdown the master and do the
binlog relocation. Once master is restarted start the slave threads and sync
them with the master. In test binary logs and index files on master are
relocated to /tmpdir but during master restart only --log-bin option is
provided, this is incorrect. Even --log-bin-index also should be pointed to
/tmpdir otherwise upon master server restart two index files will be created.
One master-bin.index in /tmpdir and a new master-bin.index as per log_basename
in datadir. Due to this slave will fail to connect to master.

'rpl_gtid_crash.test' tests following scenario "crashing master, causing slave
IO thread to reconnect while SQL thread is running". When IO thread tries to
connect to crashed master on slow platforms COM_REGISTER_SLAVE command fails.
This is expected hence the warning should be added to suppression list.
2020-09-07 17:23:45 +05:30
Kentoku SHIBA
420c4dcc7e MDEV-7098 spider/bg.spider_fixes failed in buildbot with safe_mutex: Trying to unlock mutex conn->mta_conn_mutex that wasn't locked at storage/spider/spd_db_conn.cc, line 671 2020-09-07 10:26:23 +09:00
Kentoku SHIBA
9dedba16ab MDEV-7098 spider/bg.spider_fixes failed in buildbot with safe_mutex: Trying to unlock mutex conn->mta_conn_mutex that wasn't locked at storage/spider/spd_db_conn.cc, line 671 2020-09-07 10:18:43 +09:00
xdavidwu
3ee2422624 InnoDB, XtraDB: handle EOPNOTSUPP from posix_fallocate()
On some libc (like musl[1]), posix_fallocate() is a fallocate() syscall
wrapper, and does not include fallback code like glibc does. In that
case, EOPNOTSUPP is returned if underlying filesystem does not
support fallocate() with mode = 0.

This patch enables falling back to writing zeros when EOPNOTSUPP, fixes
some cases like running on filesystem without proper fallocate support
on Alpine.

[1]: https://git.musl-libc.org/cgit/musl/tree/src/fcntl/posix_fallocate.c?h=v1.2.1
2020-09-04 13:07:45 +03:00
Marko Mäkelä
c029d45623 MDEV-23600 follow-up: uninitialized rec_field_is_prefix
build_template_field(): Initialize templ->rec_field_is_prefix
also for indexes on virtual columns. This was caught on 10.5 by
MemorySanitizer as use-of-uninitialized-value in
row_search_with_covering_prefix() when running the test
main.fast_prefix_index_fetch_innodb.
2020-09-04 12:14:24 +03:00
Sergei Petrunia
8c2909a2a4 Fix a typo in the previous cset 2020-09-04 09:12:27 +00:00
Sergei Petrunia
d63fcbc2f0 MDEV-23661: RocksDB produces "missing initializer for member" warnings
Add -Wno-missing-field-initializers for MyRocks and gcc version below 5.0
2020-09-03 15:26:55 +00:00