1) Whenever the purge thread tries to remove a secondary virtual index
entry, it acquires a metadata lock (MDL) on the table and releases
dict_operation_lock. It then retries the secondary index deletion
if the MDL was acquired successfully.
2) Inside row_vers_old_has_index_entry(), change the safe_to_purge
goto label to unsafe_to_purge, because it is more appropriate to
return true when the entry is unsafe to purge.
3) Previously, row_vers_old_has_index_entry() returned false if InnoDB
acquired the MDL on the table for the first time. This check (two cases)
should be performed only in the purge thread. In row_purge_poss_sec(),
InnoDB checks again whether the MDL was acquired for the first time. If so,
InnoDB retries the secondary index deletion logic. In that case,
InnoDB has to clean up the memory used inside row_vers_old_has_index_entry()
and should ignore the return value.
Valgrind has supported the CRC32 instruction since version
3.6.1, released in 2011. Thus, remove the fallback to the software
implementation when running under Valgrind.
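For readers unfamiliar with what a software fallback computes, here is a
minimal sketch of a bitwise CRC-32C routine. It is an illustration only and
not InnoDB's implementation; the name crc32c_sw is hypothetical, and the
choice of the Castagnoli polynomial is an assumption based on the CRC32
instruction computing CRC-32C.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical bitwise CRC-32C (Castagnoli) routine, shown only to
// illustrate what a software fallback would compute byte by byte.
uint32_t crc32c_sw(const unsigned char* buf, size_t len)
{
  uint32_t crc = 0xFFFFFFFFu;
  for (size_t i = 0; i < len; i++) {
    crc ^= buf[i];
    for (int bit = 0; bit < 8; bit++) {
      // 0x82F63B78 is the bit-reflected CRC-32C polynomial. The mask is
      // all ones when the lowest bit of crc is set, and zero otherwise.
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
  }
  return crc ^ 0xFFFFFFFFu;
}
```

A table-driven or hardware-assisted version would produce the same values,
only faster.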
This is motivated by PS-5221 in
percona/percona-server@2817c561fc
The coarser-precision ut_time() will still refer to the
system clock, meaning that bad things can happen if the
real time clock is adjusted backwards.
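The hazard with a wall-clock time source can be avoided by measuring
intervals on a monotonic clock. The following is a minimal sketch, assuming
that std::chrono::steady_clock is an acceptable time source; the helper name
elapsed_ms_since is hypothetical and does not exist in InnoDB.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: elapsed milliseconds measured on a monotonic clock.
// steady_clock never goes backwards, even if the system time is adjusted.
int64_t elapsed_ms_since(std::chrono::steady_clock::time_point start)
{
  using namespace std::chrono;
  return duration_cast<milliseconds>(steady_clock::now() - start).count();
}
```

An interval computed this way cannot become negative, unlike one derived
from two readings of the system clock.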
There is one directly applicable change to InnoDB:
commit 739f5239f1 in the
5.5 branch will be merged before the next MariaDB releases.
Another potentially applicable change will be tracked
separately as MDEV-20126.
Thus, here we only update the InnoDB version number and do
not change anything else.
This is a regression due to MDEV-16515 that affects some versions in
the MariaDB 10.1 server series starting with 10.1.35, and possibly
all versions starting with 10.2.17, 10.3.8, and 10.4.0.
The idea of MDEV-16515 is to allow DROP TABLE to be interrupted,
in case it was stuck due to some concurrent activity. We already
made some cases of internal DROP TABLE immune to kill in MDEV-18237,
MDEV-16647, MDEV-17470. We must include the cleanup of
CREATE TABLE...SELECT in the list of such internal DROP TABLE.
ha_innobase::delete_table(): Pass create_failed=true if the current
SQL statement is CREATE, so that the table will be dropped.
row_drop_table_for_mysql(): If create_failed=true, do not allow
the operation to be interrupted.
This is a race between DELETE and INSERT (or any other two operations accessing the table).
What should happen in the good case:
1. ALTER TABLE is issued. vc_templ->default_rec is initialized with the temporary share's default_fields
2. The temporary share is freed, but the datadict is still there, with garbage in vc_templ->default_rec
3. DELETE is issued. It is the first statement after ALTER TABLE finished.
4. ha_innobase::open() is called; ib_table->get_ref_count() should be one
5. We reinitialize vc_templ, so there is no garbage anymore
What actually happens:
3. DELETE is issued.
4. ha_innobase::open() is called and ib_table->get_ref_count() is 1
5. INSERT (or SELECT etc.) is issued in parallel
6. ha_innobase::open() is called and ib_table->get_ref_count() is 1
7. We check ib_table->get_ref_count() and it is 2 in both threads when we want to reinitialize vc_templ
8. The garbage is still there
Fix:
* Do not store pointers to SHARE memory in table dict, copy it instead.
* But then we don't need to refresh it each time when refcount=1.
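The idea of the fix can be sketched as follows. This is a simplified
illustration, not the actual patch: vc_templ_sketch and init_default_rec are
hypothetical names, and the real structure holds more than one field.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the virtual column template. The default
// record is an owned copy rather than a pointer into share memory.
struct vc_templ_sketch {
  std::vector<unsigned char> default_rec;
};

// Copy the default record out of share-owned memory, so that freeing the
// share later cannot leave a dangling pointer in the dictionary object.
void init_default_rec(vc_templ_sketch& templ,
                      const unsigned char* share_default, size_t len)
{
  templ.default_rec.assign(share_default, share_default + len);
}
```

Because the copy does not depend on the lifetime of the share, there is no
need to decide, based on a reference count, when to refresh it.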
Problem:
=======
Checksum fields can legitimately have the value zero. In that case, InnoDB
wrongly assumes that the page should be all zeroes, which leads to false
detection of page corruption.
Solution:
========
Remove the condition that requires the page to be all zeroes when the
checksum fields are zero.
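A minimal sketch of the intended logic, under simplifying assumptions: the
helpers page_is_all_zero and page_looks_valid are hypothetical, the checksum
values are passed in rather than read from the page, and an all-zero page is
assumed to be acceptable (a freshly allocated page).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: a page is "blank" only if every byte is zero.
bool page_is_all_zero(const unsigned char* page, size_t size)
{
  for (size_t i = 0; i < size; i++) {
    if (page[i] != 0) return false;
  }
  return true;
}

// Hypothetical validation step. A stored checksum of zero is treated like
// any other value: it is compared with the computed checksum instead of
// implying that the whole page must be blank.
bool page_looks_valid(const unsigned char* page, size_t size,
                      uint32_t stored, uint32_t computed)
{
  if (page_is_all_zero(page, size)) return true;  // assumed: blank is fine
  return stored == computed;
}
```

With this shape, a non-blank page whose checksum happens to be zero is
accepted as long as the computed checksum is zero too.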
Problem:
========
More concurrent DML can arrive while the ALTER TABLE thread is waiting
to upgrade to MDL_EXCLUSIVE before the commit phase.
In the commit phase, InnoDB acquires dict_operation_lock while already holding
MDL_EXCLUSIVE on the table. After that, InnoDB applies the concurrent DML logs.
This could block the following:
1) DML on the particular table (due to MDL_EXCLUSIVE on the table)
2) InnoDB DDL (due to dict_operation_lock)
3) The purge thread, the stats thread and the master thread (due to dict_operation_lock)
Fix:
====
Apply the concurrent DML logs in the commit phase, but before acquiring
dict_operation_lock. This ensures that (2) and (3) cannot be
blocked for a long time.
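The reordering can be sketched as follows. This is a single-threaded
illustration under stated assumptions: commit_sketch is a hypothetical type,
a std::mutex stands in for dict_operation_lock, and the table-level
MDL_EXCLUSIVE is assumed to be held already, so no new log entries arrive.

```cpp
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical stand-ins: a queue of logged DML operations and a global
// lock playing the role of dict_operation_lock.
struct commit_sketch {
  std::deque<int> dml_log;
  std::mutex dict_lock;
  size_t applied_before_lock = 0;
  size_t backlog_under_lock = 0;

  void commit()
  {
    // Apply the whole backlog first, while only the table is blocked.
    while (!dml_log.empty()) {
      dml_log.pop_front();
      applied_before_lock++;
    }
    // Only then take the global lock for the short final step.
    std::lock_guard<std::mutex> guard(dict_lock);
    backlog_under_lock = dml_log.size();  // nothing is left to apply here
  }
};
```

The time spent under the global lock no longer depends on how much DML
accumulated during the ALTER TABLE.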
Basic idea of the patch: disallow creating tables that permit rows which
are too big to insert. In other words, a user who has created a table
should never see an error like 'can not insert row as it is too big for current
page size'.
SET innodb_strict_mode=OFF; will still allow tables with very long rows to be
created, and only a warning will be issued.
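The behaviour can be sketched as a check at table creation time. This is an
illustration with hypothetical names (check_row_size, create_result) and
caller-supplied sizes; the real computation depends on the row format and
page size.

```cpp
#include <cstddef>
#include <vector>

enum class create_result { OK, OK_WITH_WARNING, ERROR_TOO_BIG };

// Hypothetical check run at CREATE TABLE time: sum the largest local
// (in-page) size each column can take and compare with the record limit.
create_result check_row_size(const std::vector<size_t>& max_local_len,
                             size_t max_rec_size, bool strict_mode)
{
  size_t total = 0;
  for (size_t len : max_local_len) total += len;
  if (total <= max_rec_size) return create_result::OK;
  // In strict mode the table is rejected; otherwise only a warning is given.
  return strict_mode ? create_result::ERROR_TOO_BIG
                     : create_result::OK_WITH_WARNING;
}
```

Rejecting the definition up front means the size problem surfaces at CREATE
TABLE instead of at some later INSERT.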
dict_table_t::get_overflow_field_local_len(): this function returns the maximum
local field length for overflow fields for every file and row format.
innobase_check_column_length(): rename to too_big_key_part_length()
and reuse it in a different part of the code.
create_table_info_t::prepare_create_table(): add a check for the maximum allowed
key part length to keep ALGORITHM=COPY behavior similar to ALGORITHM=INPLACE
behavior. The affected test is innodb.strict_mode.
Rename dict_index_too_big_for_tree() to
dict_index_t::rec_potentially_too_big(): copy the overflow-related size computation
from dtuple_convert_big_rec(). A lot of tests were changed because of that.
I wonder whether users will complain about it?
Test innodb.max_record_size tests dict_index_t::rec_potentially_too_big()
for different row formats and page sizes.
In row_ins_foreign_check_on_constraint(), the clustered index record is
passed to wsrep_append_foreign_key() after the latch has been released.
If the record has been changed by another thread in the meantime, this
can lead to a crash when wsrep_rec_get_foreign_key() tries to access
the record.
row_ins_foreign_check_on_constraint(): Use cascade->pcur->old_rec
instead of clust_rec.
row_ins_check_foreign_constraint(): Add a missing error printout.
The test innodb.leaf_page_corrupted_during_recovery
fails on buildbot with
Warning 1406 Data too long for column 'line' at row 10
line
len 16384; hex ...
because of the page dumps that InnoDB generates for a corrupted page.
Since this test is using debug instrumentation, we will solve the
issue by disabling page dumps in debug builds altogether. Users of
debug builds will likely know how to extract page dumps by other means.
Page dump output could sometimes be useful when diagnosing problems
that users are facing. Hence we will keep the page dump output in
non-debug (release) builds.
- Ported the MySQL Bug#20597981 test case to mariadb-10.2
- InnoDB never used fts_doc_id_in_read_set. Basically it tells
InnoDB to read the fts_doc_id from the index record itself.
- Introduce a new variable called innodb_encrypt_temporary_tables which is
a boolean variable. It decides whether to encrypt the temporary tablespace.
- Encrypts the temporary tablespace based on full checksum format.
- Introduced a new counter to track encrypted and decrypted temporary
tablespace pages.
- A warning is issued if temporary table creation conflicts with the value of
innodb_encrypt_temporary_tables
- Added a new test case which reads and writes the pages from/to temporary
tablespace.
Added a condition in the innochecksum tool to check for a page id mismatch.
This could catch write corruption caused by InnoDB.
Added a debug check inside fil_io() to verify that it does not write
the page to the wrong offset.
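A page id check of this kind can be sketched as follows. This is not the
innochecksum code: the helper names are hypothetical, and the sketch assumes
that the page number is stored as a 4-byte big-endian integer at byte offset
4 of the page header.

```cpp
#include <cstddef>
#include <cstdint>

// Read a 4-byte big-endian integer.
uint32_t read_be32(const unsigned char* p)
{
  return (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16)
       | (uint32_t(p[2]) << 8) | uint32_t(p[3]);
}

// Hypothetical check: the page number stored in the page header (assumed
// to live at byte offset 4) must match the page's position in the file.
bool page_no_matches(const unsigned char* page, uint64_t file_offset,
                     size_t page_size)
{
  return read_be32(page + 4) == file_offset / page_size;
}
```

A page that was written to the wrong offset carries a page number that
disagrees with its position, so the comparison fails.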
cmake -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_BUILD_TYPE=Debug
Maintainer mode turns all warnings into errors. This patch fixes the warnings,
which are mostly about the deprecated `register` keyword.
Too many warnings came from Mroonga and I gave up on it.
Problem:
=========
One of the purge threads accesses a corrupted page and tries to remove it from
the LRU list. In the meantime, other purge threads are waiting for the same page
in buf_wait_for_read(). The assertion (buf_fix_count == 0) fails for the
purge thread that tries to remove the page from the LRU list.
Solution:
========
- Set the page id to FIL_NULL to indicate that the page is corrupted before
removing the block from the LRU list. Acquire the hash lock for the particular
page id and wait for the other threads to release buf_fix_count
for the block.
- Added an error check for btr_cur_open() in row_search_on_row_ref().
Explicitly mention every option in .clang-format to protect us from possible
future changes.
Remove separate InnoDB style.
Change style to look more like this script:
for x in $@
do
indent -kr -bl -bli0 -l79 -i2 -nut -c48 -dj -cp0 $x
sed -ri -e 's/ = /= /g'\
-e '/switch.*\)$/{N;s/\n[ ]+/ /}' $x
done
A significant difference is that 'switch' and '{' are put on different lines
because it's impossible in clang-format to set formatting rules just for
the 'switch' statement.
GCC 9.1.1 noticed that sd_notifyf() was always being invoked with
str=NULL argument for "%s". This code was added in
commit 2e814d4702
but not mentioned in the commit comment.
The STATUS messages for systemd matter during startup and shutdown,
and should not be emitted during normal operation.
ib_senderrf(): Remove the potentially harmful sd_notifyf() calls.
- If InnoDB encounters a garbage or incompletely written log block during
recovery, do not report an error. Treat it as the end of the log.
- Such an incomplete or empty block can be the result of killing
InnoDB while it was writing the redo log.
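The recovery rule can be sketched as a scan that stops quietly at the first
invalid block. The types and names below are hypothetical; a real redo log
block carries a header and a checksum rather than a ready-made flag.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical log block: a payload plus a flag saying whether the block
// passed validation. Real blocks are validated from their contents.
struct log_block_sketch {
  bool valid;
  int payload;
};

// Scan blocks in order and stop, without raising an error, at the first
// block that fails validation. Returns the number of blocks to apply.
size_t find_end_of_log(const std::vector<log_block_sketch>& blocks)
{
  size_t n = 0;
  while (n < blocks.size() && blocks[n].valid) n++;
  return n;
}
```

Anything after the first invalid block is ignored, which matches the case
of a write that was interrupted when the server was killed.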
Some I/O functions and macros that are declared in os0file.h used to
return a Boolean status code (nonzero on success). In MySQL 5.7, they
were changed to return dberr_t instead. Alas, in MariaDB Server 10.2,
some uses of functions were not adjusted to the changed return value.
Until MDEV-19231, the valid values of dberr_t were always nonzero.
This means that some code that was incorrectly checking for a zero
return value from the functions would never detect a failure.
After MDEV-19231, some tests for ALTER ONLINE TABLE would fail with
cmake -DPLUGIN_PERFSCHEMA=NO. It turned out that the wrappers
pfs_os_file_read_no_error_handling_int_fd_func() and
pfs_os_file_write_int_fd_func() were wrongly returning
bool instead of dberr_t. Also the callers of these functions were
wrongly expecting bool (nonzero on success) instead of dberr_t.
This mistake had been made when the addition of these functions was
merged from MySQL 5.6.36 and 5.7.18 into MariaDB Server 10.2.7.
This fix also reverts commit 40becbc3c7
which attempted to work around the problem.
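The class of mistake can be sketched as follows. The enum and functions are
hypothetical stand-ins, and the sketch assumes that the success code is zero,
as the description above implies for dberr_t after MDEV-19231.

```cpp
// Hypothetical error enum in which success is zero.
enum err_sketch { ERR_SUCCESS = 0, ERR_IO = 1 };

// Hypothetical I/O wrapper that reports a failure when asked to.
err_sketch read_sketch(bool fail) { return fail ? ERR_IO : ERR_SUCCESS; }

// Wrong: treats the result as a Boolean where nonzero means success.
bool caller_bool_style(bool fail) { return read_sketch(fail) != 0; }

// Right: compares against the success code explicitly.
bool caller_err_style(bool fail) { return read_sketch(fail) == ERR_SUCCESS; }
```

With a zero success code, the Boolean-style caller reports every successful
read as a failure and every failed read as a success.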
Problem:
=======
fil_iterate() writes page 0 of the imported tablespace as is to the discarded
tablespace, without even changing the space id. As a result, opening the
tablespace fails with a space id mismatch error.
Fix:
====
fil_iterate() copies page 0 with the discarded space id to the imported
tablespace.
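Rewriting the space id in a page header can be sketched as follows. This is
an illustration, not the fil_iterate() code: the helper names are
hypothetical, and byte offset 34 for the space id field is an assumption made
for the example.

```cpp
#include <cstdint>

// Write a 4-byte big-endian integer.
void write_be32(unsigned char* p, uint32_t v)
{
  p[0] = static_cast<unsigned char>(v >> 24);
  p[1] = static_cast<unsigned char>(v >> 16);
  p[2] = static_cast<unsigned char>(v >> 8);
  p[3] = static_cast<unsigned char>(v);
}

// Assumed offset of the space id within the page header (illustrative).
const unsigned SPACE_ID_OFFSET_SKETCH = 34;

// Hypothetical helper: stamp the expected space id into page 0 before the
// page is written, so that the id found on open matches the dictionary.
void stamp_space_id(unsigned char* page0, uint32_t space_id)
{
  write_be32(page0 + SPACE_ID_OFFSET_SKETCH, space_id);
}
```

Without such a step, page 0 keeps the space id of the tablespace it was
copied from.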