As reported in MDEV-11969 "there's no way to ditch knowledge" about some
domain that is no longer updated on a server. Besides being of annoyance to
clutter output in DBA console stale domains can prevent the slave
to connect the master as MDEV-12012 witnesses.
What domain is obsolete must be evaluated by the user (DBA) according
to whether the domain info is still relevant and will the domain ever
receive any update.
This patch introduces a method to discard obsolete gtid domains from
the server binlog state. The removal requires no event group from such
domain present in existing binlog files though. If there are any the
containing logs must be first PURGEd in order for
FLUSH BINARY LOGS DELETE_DOMAIN_ID=(list-of-domains)
succeed. Otherwise the command returns an error.
The list of obsolete domains can be computed through
intersecting two sets - the earliest (first) binlog's Gtid_list
and the current value of @@global.gtid_binlog_state - and extracting
the domain id components from the intersection list items.
The new DELETE_DOMAIN_ID featured FLUSH continues to rotate binlog
omitting the deleted domains from the active binlog file's Gtid_list.
Notice though when the command is ineffective - that none of requested to delete
domain exists in the binlog state - rotation does not occur.
Obsolete domain deletion is not harmful for connected slaves as long
as master side binlog files *purge* is synchronized with FLUSH-DELETE_DOMAIN_ID.
The slaves must have the last event from purged files processed as usual,
in order not to bump later into requesting a gtid from a file which
was already gone.
While the command is not replicated (as ordinary FLUSH BINLOG LOGS is)
slaves, even though having extra domains, won't suffer from reconnection errors
thanks to master-slave gtid connection protocol allowing the master
to be ignorant about a gtid domain.
Should at failover such slave to be promoted into master role it may run
the ex-master's
FLUSH BINARY LOGS DELETE_DOMAIN_ID=(list-of-domains)
to clean its own binlog state.
NOTES.
suite/perfschema/r/start_server_low_digest.result
is re-recorded as consequence of internal parser codes changes.
1. remove erroneously committed *.orig
2. fix LZ4 detection on Mac OS X and FreeBSD. Cannot do
pkg_check_modules(LIBLZ4 liblz4)
find_library(LIBLZ4_LIBS ... )
because find_library(X) does not do anything if X is defined (documented),
and pkg_check_modules(Y) sets Y_LIBS to "" (undocumented!)
Some tests are skipped by checks in suite.pm. It is redundant to
have an sql-level run-time check in the .inc file itself.
In some cases it's not only redundant, but dangerous.
After one bug in 10.2 innodb.create_isl_with_direct failed
to start InnoDB, but the server started fine (just without InnoDB)
and instead of failing, the test was skipped by run-time check in
have_innodb.inc.
# Conflicts:
# mysql-test/include/not_embedded.inc
# mysql-test/r/change_user_notembedded.result
# mysql-test/suite.pm
# mysql-test/t/change_user_notembedded.test
Make differentiation between pullout for merge and pulout of outer field during exists2in transformation.
In last case the field was outer and so we can safely start from name resolution context of the SELECT where it was pulled.
Old behavior lead to inconsistence between list of tables and outer name resolution context (which skips one SELECT for merge purposes) which creates problem vor name resolution.
* created tests focusing in multi-master conflicts during cascading foreign key
processing
* in row0upd.cc, calling wsrep_row_ups_check_foreign_constraints only when
running in cluster
* in row0ins.cc fixed regression from MW-369, which caused crash with MW-402.test
* changed thd_binlog_format to return configured binlog format in wsrep execution,
regardless of binlogging setting (i.e. with or without binlogging)
* thd_binlog_format is used in innobase::write_row(), and would return confusing
result there when log_bin==OFF
It is possible for a stored procedure that has an error handler
that catches SQLEXCEPTION to call thd->clear_error() on a thd
that failed certification. And because the error is cleared,
wsrep patch proceeds with the normal path and may try to commit
statements that should actually abort.
This patch catches the situation where wsrep_conflict_state is
still set, but the thd's error has been cleared, and rolls back
the statement in such cases.
With a big buffer pool that contains many data pages,
DISCARD TABLESPACE took a long time, because it would scan the
entire buffer pool to remove any pages that belong to the tablespace.
With a large buffer pool, this would take a lot of time, especially
when the table-to-discard is empty.
The minimum amount of work that DISCARD TABLESPACE must do is to
remove the pages of the to-be-discarded table from the
buf_pool->flush_list because any writes to the data file must be
prevented before the file is deleted.
If DISCARD TABLESPACE does not evict the pages from the buffer pool,
then IMPORT TABLESPACE must do it, because we must prevent pre-DISCARD,
not-yet-evicted pages from being mistaken for pages of the imported
tablespace.
It would not be a useful fix to simply move the buffer pool scan to
the IMPORT TABLESPACE step. What we can do is to actively evict those
pages that could be mistaken for imported pages. In this way, when
importing a small table into a big buffer pool, the import should
still run relatively fast.
Import is bypassing the buffer pool when reading pages for the
adjustment phase. In the adjustment phase, if a page exists in
the buffer pool, we could replace it with the page from the imported
file. Unfortunately I did not get this to work properly, so instead
we will simply evict any matching page from the buffer pool.
buf_page_get_gen(): Implement BUF_EVICT_IF_IN_POOL, a new mode
where the requested page will be evicted if it is found. There
must be no unwritten changes for the page.
buf_remove_t: Remove. Instead, use trx!=NULL to signify that a write
to file is desired, and use a separate parameter bool drop_ahi.
buf_LRU_flush_or_remove_pages(), fil_delete_tablespace():
Replace buf_remove_t.
buf_LRU_remove_pages(), buf_LRU_remove_all_pages(): Remove.
PageConverter::m_mtr: A dummy mini-transaction buffer
PageConverter::PageConverter(): Complete the member initialization list.
PageConverter::operator()(): Evict any 'shadow' pages from the
buffer pool so that pre-existing (garbage) pages cannot be mistaken
for pages that exist in the being-imported file.
row_discard_tablespace(): Remove a bogus comment that seems to
refer to IMPORT TABLESPACE, not DISCARD TABLESPACE.
String comparison with utf8_bin collation is case sensitive.
Hence "DELIMITER" did not match with "delimiter".
The delimiter command matching now uses my_charset_latin1.
ibuf_check_bitmap_on_import(): Only access the pages that
are below FSP_FREE_LIMIT. It is possible that especially with
ROW_FORMAT=COMPRESSED, the FSP_SIZE will be much bigger than
the FSP_FREE_LIMIT, and the bitmap pages (page_size*N, 1+page_size*N)
are filled with zero bytes.
buf_page_is_corrupted(), buf_page_io_complete(): Make the
fault injection compatible with MariaDB 10.2.
Backport the IMPORT tests from 10.2.
On some old GNU/Linux systems, invoking posix_fallocate() with
offset=0 would sometimes cause already allocated bytes in the
data file to be overwritten.
Fix a correctness regression that was introduced in
commit 420798a81a
by invoking posix_fallocate() in a safer way.
A similar change was made in MDEV-5746 earlier.
os_file_get_size(): Avoid changing the state of the file handle,
by invoking fstat() instead of lseek().
os_file_set_size(): Determine the current size of the file
by os_file_get_size(), and then extend the file from that point
onwards.
os_file_set_size(): If posix_fallocate() returns EINVAL, fall back
to writing zero bytes to the file. Also, remove some error log output,
and make it possible for a server shutdown to interrupt the fall-back
code.
MariaDB used to ignore any possible return value from posix_fallocate()
ever since innodb_use_fallocate was introduced in MDEV-4338. If EINVAL
was returned, the file would not be extended.
Starting with MDEV-11520, MariaDB would treat EINVAL as a hard error.
Why is the EINVAL returned? The GNU posix_fallocate() function
would first try the fallocate() system call, which would return
-EOPNOTSUPP for many file systems (notably, not ext4). Then, it
would fall back to extending the file one block at a time by invoking
pwrite(fd, "", 1, offset) where offset is 1 less than a multiple of
the file block size. This would fail with EINVAL if the file is in
O_DIRECT mode, because O_DIRECT requires aligned operation.
- innodb_buffer_pool_dump_now_basic is modified to make sure it
really performs a dump and waits till it completion, to avoid
the apparent or hidden failure similar to MDEV-9713 / MDEV-10651
- innodb_buffer_pool_dump_pct_basic is modified to re-use the new
code from innodb_buffer_pool_dump_now_basic and thus avoid
the failure MDEV-10651
- innodb_buffer_pool_load_now_basic is re-written to simplify
the logic by re-using the code innodb_buffer_pool_dump_now_basic
and is given an opt file to avoid race conditions with
buffer pool load performed upon server startup, which causes
MDEV-14196 failure
The test would pass even with skipped partitioning, because
CHECK PARTITION for a view works identically with enabled/disabled
partitioning; but if the server is compiled without partitioning
at all, it cannot execute the statement, and the test would fail.
Check for the presence of partitioning allows to skip the test
in this case, rather than let it fail