Analysis: In case of an error while processing the JSON document, we
jump to the error label, which eventually returns 1 instead of 0.
Fix: Return 0 instead of 1 in case of an error.
1) When at least one of the two JSON documents is of scalar type:
1.a) If the value and the JSON document are both scalars, return true
if they have the same type and value.
1.b) If the JSON document is a scalar but the other is an array (or
vice versa), return true if the array has at least one element of the
same type and value as the scalar.
1.c) If one is a scalar and the other is an object, return false,
because they cannot be compared.
2) When both arguments are of non-scalar type, return true when the
conditions below are satisfied:
2.a) When both arguments are arrays:
Iterate over the value and the JSON document; return true if there
exists at least one element in one array with the same type and
value as an element of the other.
2.b) If both arguments are objects:
Iterate over the value and the JSON document; return true if there
exists at least one key-value pair common to the two objects.
2.c) If one of the JSON document and the value is an array and the
other is an object:
Iterate over the array; if an element of type object is found,
compare it with the object (the other argument); return true only if
the entire object matches, i.e. all the key-value pairs match
(see the sketch below).
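Here is a minimal C++ sketch of rules 1.a-2.c; the Json type and both
helper functions are hypothetical stand-ins, not the server's json_lib
implementation:

    #include <algorithm>
    #include <map>
    #include <string>
    #include <vector>

    // Hypothetical minimal JSON value; a scalar folds its type and
    // value into a single string for brevity.
    struct Json {
      enum Kind { SCALAR, ARRAY, OBJECT } kind;
      std::string scalar;                   // when kind == SCALAR
      std::vector<Json> elems;              // when kind == ARRAY
      std::map<std::string, Json> members;  // when kind == OBJECT
    };

    // Deep equality, needed for the full-object match of rule 2.c.
    static bool json_equal(const Json &a, const Json &b)
    {
      if (a.kind != b.kind)
        return false;
      switch (a.kind) {
      case Json::SCALAR:
        return a.scalar == b.scalar;
      case Json::ARRAY:
        return std::equal(a.elems.begin(), a.elems.end(),
                          b.elems.begin(), b.elems.end(), json_equal);
      case Json::OBJECT:
        return std::equal(a.members.begin(), a.members.end(),
                          b.members.begin(), b.members.end(),
                          [](const auto &x, const auto &y) {
                            return x.first == y.first &&
                                   json_equal(x.second, y.second);
                          });
      }
      return false;
    }

    static bool json_overlaps(const Json &doc, const Json &val)
    {
      if (doc.kind == Json::SCALAR && val.kind == Json::SCALAR)
        return json_equal(doc, val);                          // rule 1.a
      if (doc.kind == Json::SCALAR || val.kind == Json::SCALAR) {
        const Json &s = doc.kind == Json::SCALAR ? doc : val;
        const Json &other = doc.kind == Json::SCALAR ? val : doc;
        if (other.kind == Json::OBJECT)
          return false;                                       // rule 1.c
        for (const Json &e : other.elems)                     // rule 1.b
          if (json_equal(e, s))
            return true;
        return false;
      }
      if (doc.kind == Json::ARRAY && val.kind == Json::ARRAY) {
        for (const Json &d : doc.elems)                       // rule 2.a
          for (const Json &v : val.elems)
            if (json_equal(d, v))
              return true;
        return false;
      }
      if (doc.kind == Json::OBJECT && val.kind == Json::OBJECT) {
        for (const auto &m : doc.members) {                   // rule 2.b
          auto it = val.members.find(m.first);
          if (it != val.members.end() &&
              json_equal(m.second, it->second))
            return true;
        }
        return false;
      }
      // rule 2.c: one argument is an array, the other an object
      const Json &arr = doc.kind == Json::ARRAY ? doc : val;
      const Json &obj = doc.kind == Json::ARRAY ? val : doc;
      for (const Json &e : arr.elems)
        if (e.kind == Json::OBJECT && json_equal(e, obj))
          return true;
      return false;
    }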
The comparison on the checkpoint age (number of log bytes
written since the previous checkpoint) is inaccurate, because
the previous FILE_CHECKPOINT record could span two 512-byte
log blocks, which will cause the LSN to increase by the size of the
log block header and footer.
We will still generate a redundant checkpoint if the previous
checkpoint wrote some FILE_MODIFY records before the FILE_CHECKPOINT
record.
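To illustrate the slack involved, here is a small sketch; the framing
constants reflect the traditional InnoDB redo log block format and
should be treated as assumptions:

    // Assumed framing of the traditional InnoDB redo log block format:
    constexpr unsigned LOG_BLOCK_SIZE     = 512; // one log block
    constexpr unsigned LOG_BLOCK_HDR_SIZE = 12;  // per-block header
    constexpr unsigned LOG_BLOCK_TRL_SIZE = 4;   // per-block checksum trailer

    // A record of len payload bytes that crosses a block boundary
    // advances the LSN by more than len: the trailer of the first block
    // and the header of the next block are counted as well. Any exact
    // comparison on the checkpoint age must allow for this slack.
    unsigned lsn_advance(unsigned len, bool crosses_block_boundary)
    {
      return len + (crosses_block_boundary
                    ? LOG_BLOCK_TRL_SIZE + LOG_BLOCK_HDR_SIZE  // 16 bytes
                    : 0);
    }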
Whenever we retrieve an older version for READ COMMITTED,
it is better to release the undo page latches
so that we can freely move to the next clustered index record
without potentially violating any latching order.
Only one checkpoint may be in progress at a time.
The counter log_sys.n_pending_checkpoint_writes
was being protected by log_sys.mutex.
Let us replace it with the Boolean log_sys.checkpoint_pending.
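A condensed sketch of the change; std::mutex stands in for
log_sys.mutex to keep the example self-contained:

    #include <mutex>

    struct log_sys_t {
      std::mutex mutex;
      bool checkpoint_pending = false; // was: ulint n_pending_checkpoint_writes
    };

    // Returns true if the caller may proceed with the checkpoint write.
    bool checkpoint_begin(log_sys_t &log)
    {
      std::lock_guard<std::mutex> g(log.mutex);
      if (log.checkpoint_pending)
        return false;                // a checkpoint is already in progress
      log.checkpoint_pending = true;
      return true;
    }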
srv_start(): Set srv_startup_is_before_trx_rollback_phase before
starting the buf_flush_page_cleaner() thread, so that it will not
invoke log_checkpoint() before the log file has been created.
This race condition was reproduced with https://rr-project.org.
This fixes up commit 15efb7ed48.
buf_pool_t::watch_unset(): Reorder some code so that
no warning will be emitted in CMAKE_BUILD_TYPE=RelWithDebInfo.
It is unclear why invoking watch_is_sentinel() before
buf_fix_count() would make the warning disappear.
For GTID consistency, GTID events were artificially added before
replication happened. Such an event should not contain a calculated
checksum.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
On the affected machine, the error happens sporadically in
innodb.instant_alter_limit.
Procmon shows SetRenameInformationFile failing with ERROR_ACCESS_DENIED.
In this case, the destination file had previously been opened and
oplocked by the Windows Defender antivirus.
The fix is to retry MoveFileEx on ERROR_ACCESS_DENIED.
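A minimal sketch of such a retry loop; the retry count and delay are
assumptions, not the values used in the actual fix:

    #include <windows.h>

    // Filter drivers (e.g. an antivirus scanner) may hold the
    // destination open briefly, so MoveFileEx() can fail with a
    // transient ERROR_ACCESS_DENIED; retry a bounded number of times.
    static BOOL move_file_with_retry(const wchar_t *from, const wchar_t *to)
    {
      for (int attempt = 0;; attempt++) {
        if (MoveFileExW(from, to, MOVEFILE_REPLACE_EXISTING))
          return TRUE;
        if (GetLastError() != ERROR_ACCESS_DENIED || attempt >= 100)
          return FALSE;
        Sleep(10); // wait for the other opener to release the file
      }
    }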
In commit 437da7bc54 (MDEV-19534),
the default value of the global variable srv_checksum_algorithm
in innochecksum was changed from SRV_CHECKSUM_ALGORITHM_INNODB
to implied 0 (innodb_checksum_algorithm=crc32). As a result,
the function buf_page_is_corrupted() would by default invoke
buf_calc_page_crc32() in innochecksum, and crc32_inited would hold.
This would cause "innochecksum" to fail on a particular page.
The actual problem is older, introduced in 2011 in
mysql/mysql-server@17e497bdb7
(MySQL 5.6.3). It should affect the validation of pages of old
data files that were written with innodb_checksum_algorithm=innodb.
When using innodb_checksum_algorithm=crc32 (the default setting
since MariaDB Server 10.2), some valid pages would be rejected
only because exactly one of the two checksum fields accidentally
matches the innodb_checksum_algorithm=crc32 value.
buf_page_is_corrupted(): Simplify the logic of non-strict
checksum validation, by always invoking buf_calc_page_crc32().
Remove a bogus condition that if only one of the checksum fields
contains the value returned by buf_calc_page_crc32(), the page
is corrupted.
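A condensed sketch of the simplified non-strict validation; only
buf_calc_page_crc32() is named in the message, the field accessors and
legacy-algorithm helpers below are hypothetical:

    #include <cstdint>
    using byte = unsigned char;

    // Declarations only; stand-ins for the real InnoDB routines.
    uint32_t buf_calc_page_crc32(const byte *page);
    uint32_t checksum_field1(const byte *page); // stored header checksum
    uint32_t checksum_field2(const byte *page); // stored trailer checksum
    bool matches_legacy_innodb(const byte *page, uint32_t f1, uint32_t f2);
    bool matches_none(uint32_t f1, uint32_t f2);

    // Non-strict check: a page is accepted if both checksum fields
    // agree with some one supported algorithm. A single accidental
    // match of one field with the crc32 value no longer condemns it.
    bool page_passes_nonstrict(const byte *page)
    {
      const uint32_t crc32  = buf_calc_page_crc32(page); // always computed now
      const uint32_t field1 = checksum_field1(page);
      const uint32_t field2 = checksum_field2(page);
      if (field1 == crc32 && field2 == crc32)
        return true;                       // innodb_checksum_algorithm=crc32
      if (matches_legacy_innodb(page, field1, field2))
        return true;                       // innodb_checksum_algorithm=innodb
      return matches_none(field1, field2); // innodb_checksum_algorithm=none
    }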
In commit 7a4fbb55b0 (MDEV-25105)
the innochecksum option --write (-w) was removed altogether.
It should have been made a Boolean option, so that old data files
may be converted to a format that is compatible with
innodb_checksum_algorithm=strict_crc32 by executing the following:
innochecksum -n -w ibdata* */*.ibd
It would be better to use an older-version innochecksum
for such a conversion, so that page checksums will be validated
before updating the checksum.
It never was possible for innochecksum to convert files to the
innodb_checksum_algorithm=full_crc32 format that is the default
for new InnoDB data files.
This also fixes MDEV-20198: Instant ALTER TABLE is not crash safe
InnoDB dictionary recovery wrongly used the READ UNCOMMITTED isolation
level, causing some mismatch. For example, if a table was renamed or
replaced in a transaction, according to READ UNCOMMITTED the table might
not exist at all.
We implement READ COMMITTED isolation level for accessing the dictionary
tables SYS_TABLES, SYS_COLUMNS, SYS_INDEXES, SYS_FIELDS, SYS_VIRTUAL,
SYS_FOREIGN, SYS_FOREIGN_COLS. For most of these tables, no secondary
index exists. For the secondary indexes (on SYS_TABLES.ID,
SYS_FOREIGN.FOR_NAME, SYS_FOREIGN.REF_NAME), we will always look up
the primary key in the clustered index and check if the record actually
is a committed version.
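A sketch of that access pattern; the helper names are illustrative, not
actual InnoDB functions:

    struct rec_t; // stand-in for an index record

    const rec_t *lookup_clustered_by_pk(const rec_t *sec_rec);
    bool rec_is_committed_version(const rec_t *rec);
    bool sec_rec_matches_clustered(const rec_t *sec_rec,
                                   const rec_t *clust_rec);

    // For the few secondary indexes on the dictionary tables, never
    // trust the secondary index record itself: look up the primary key
    // in the clustered index and only use a committed version.
    const rec_t *dict_lookup_committed(const rec_t *sec_rec)
    {
      const rec_t *clust_rec = lookup_clustered_by_pk(sec_rec);
      if (!clust_rec || !rec_is_committed_version(clust_rec))
        return nullptr; // uncommitted or missing: treat as not found
      if (!sec_rec_matches_clustered(sec_rec, clust_rec))
        return nullptr; // stale secondary index entry: skip it
      return clust_rec;
    }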
dict_check_sys_tables(): Recover tablespaces also from delete-marked
committed records, so that if a matching .ibd file exists, it will
be removed by fil_delete_tablespace() when the committed delete-marked
SYS_INDEXES record of the clustered index is purged
in row_purge_remove_clust_if_poss_low().
fil_ibd_open(): Change the Boolean parameter "validate" to a ternary
one, to suppress error messages when the file might not exist.
It is possible that a .ibd file was deleted and the server shut down
before the SYS_INDEXES and SYS_TABLES records were purged. Hence, if
dict_check_sys_tables() finds a committed delete-marked record,
we must not complain if the tablespace file is not found.
On Windows, we must treat ERROR_PATH_NOT_FOUND (directory not found)
in the same way as ERROR_FILE_NOT_FOUND. This fixes a few failures where
a previous test successfully executed DROP DATABASE (and deleted all
files and the directory), but a committed delete-marked SYS_TABLES
record had not been purged before server restart.
dict_getnext_system_low(): Do not filter out delete-marked records.
dict_startscan_system(), dict_getnext_system(): Do filter out
delete-marked records, for accessing the INFORMATION_SCHEMA tables.
dict_sys_tables_rec_read(): Return the DB_TRX_ID of the committed
version of the record. This is needed in dict_load_table_low().
dict_load_foreign_cols(), dict_load_foreign(): Add a parameter for
the current transaction identifier. In some DDL operations, the
FOREIGN KEY constraints are being loaded from the data dictionary
before the DDL transaction has been committed. For SYS_FOREIGN
and SYS_FOREIGN_COLS, we must implement the special case of
READ COMMITTED that the changes of the uncommitted current transaction
are visible.
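A sketch of that visibility rule; trx_id_t is real InnoDB terminology,
but the function and the active-transaction lookup are assumptions:

    #include <cstdint>
    using trx_id_t = uint64_t;

    bool trx_is_active(trx_id_t id); // stand-in for a trx_sys lookup

    // READ COMMITTED for SYS_FOREIGN and SYS_FOREIGN_COLS, with the
    // special case that the current (uncommitted) DDL transaction sees
    // its own changes.
    bool foreign_rec_visible(trx_id_t rec_trx_id, trx_id_t current_trx_id)
    {
      if (rec_trx_id == current_trx_id)
        return true;                  // own uncommitted change is visible
      return !trx_is_active(rec_trx_id); // else require a committed version
    }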
dict_load_foreign(): Validate the table name. We could find a
SYS_FOREIGN.ID via a committed delete-marked secondary index record
that does not match the REF_NAME or FOR_NAME of the secondary index record.
dict_load_index_low(): Optionally take the table as a parameter,
so that table->def_trx_id can be updated in case of a
committed delete-marked SYS_INDEXES record corresponding
to DROP INDEX, but not corresponding to an index stub of ADD INDEX.
dict_load_indexes(): Do not update table->def_trx_id
in case of delete-marked records.
rec_is_metadata(), rec_offs_make_valid(), rec_get_offsets_func(),
row_build_low(): Relax some assertions. We may now have
!index->is_instant() even if a metadata record is present in the index.
Previously, the recovery of instant ADD/DROP COLUMN assumed
that READ UNCOMMITTED of the data dictionary will be performed.
Now, we will have a READ COMMITTED copy of the data dictionary
cache, and a READ UNCOMMITTED copy of the metadata record.
btr_page_reorganize_low(): Correctly update the FIL_PAGE_TYPE
when rolling back an instant ADD/DROP COLUMN operation.
row_rec_to_index_entry_impl(): Relax some assertions,
and disallow accessing "extra" fields. This fixes the recovery
of a crash during an instant ADD COLUMN after a successful
instant DROP COLUMN, in the test innodb.instant_alter_crash.
Tested by: Matthias Leich
InnoDB background statistics recalculation may acquire
a metadata lock also on the table itself, not only on the tables
that store the statistics.
Hence, it is better to disable InnoDB persistent statistics altogether.
This fixes up commit 9b8d9a1db3.
The autopkgtest was failing due to a missing *.changes file. This is
part of the source build, so revert autobake-deb.sh back to NOT using
-b for Gitlab-CI/Salsa-CI runs.
This bug affected queries with IN predicates that contain parameter markers
in the value list. Such queries are executed via prepared statements.
The problem appeared only if the number of elements in the value list
was greater than the set value of the system variable
in_predicate_conversion_threshold.
The patch unconditionally prohibits conversion of an IN predicate to the
equivalent IN predicand if the value list of the IN predicate contains
parameter markers.
Approved by Oleksandr Byelkin <sanja@mariadb.com>
Problem: stale semisync plugin entries could be left behind in the
mysql.plugin table.
Fix: Since mysql_upgrade runs commands from mysql_system_tables_fix.sql,
added SQL commands to check for semisync plugins in
INFORMATION_SCHEMA.PLUGINS and, if they are not there, delete them
from mysql.plugin.
Problem: In regular replication, when the master binlogged using
statement format, the slave might not have written an event to its
binary log when the Query event aimed at a temporary table.
Specifically, this was observed with LOAD DATA INFILE.
This effect was possible because, unlike the master, the slave holds
temporary tables in its pool, so the master-side check for the
existence of a temporary table at the format bin-logging decision did
not apply.
Solution: Replace THD::has_thd_temporary_tables() with
THD::has_temporary_tables(), which allows identifying the presence of
temporary tables on either side.
--
Reviewed by Andrei Elkin.
This commit adds an mtr test reproducing a scenario where, despite
innodb_disallow_writes blocking, writes to the file system can still
happen. The test launches a garbd node, which triggers one of the
cluster nodes to switch to the SST donor state. In this state, all disk
activity should be halted, and e.g. innodb_disallow_writes has been
set. The test records an md5sum aggregate over the MariaDB data
directory when the node enters the donor state, and records another
md5sum when the node leaves the donor state. If there is no IO activity
in the data directory, these hashes should be equal.
For this test, the donor state processing has been instrumented so that
the SST donor thread can be stopped when entering the donor state. The
test uses this new dbug sync point to control when to record the
md5sums.
A new SST script was added: wsrep_sst_backup. garbd uses the backup
method to make the donor node call this script and enter the donor
state. The backup script could later be extended into a general-purpose
backup method for the cluster.
This commit also fixes a race condition in wsrep_sst_rsync:
* the wsrep_rsync_sst script requests a flush of tables,
and then waits in a loop until mariadbd has created the tables_flushed
file as confirmation that FLUSH TABLES has completed
* mariadbd's SST donor thread wakes up for the flush table request,
performs FTWRL, and after this creates the tables_flushed file
* note that the SST script will now continue and start up the rsync
sending
* mariadbd's SST donor thread now calls sst_disallow_writes(),
so that InnoDB would set up the disk IO blockage; however, rsyncing may
already be ongoing at this point
This race condition is fixed in this commit by performing all disk IO
blocking before creating the tables_flushed file.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
buf_pool_t::watch_unset(): Reorder some code so that
no warning will be emitted in CMAKE_BUILD_TYPE=RelWithDebInfo.
It is unclear why invoking watch_is_sentinel() before accessing
the block descriptor state would make the warning disappear.
A new function to release latches up to a savepoint was added to mtr_t.
As there is no longer a need to limit the MDEV-20605 fix to locking
reads only, the limitation is removed.
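A sketch of the interface involved; mtr_t::get_savepoint() exists in
InnoDB, but the name and signature of the release function here are
assumptions:

    #include <cstddef>
    using ulint = size_t;

    struct mtr_t {
      ulint get_savepoint() const;          // existing: current memo position
      void rollback_to_savepoint(ulint sp); // assumed name of the new
                                            // function: release latches
                                            // acquired after sp
    };

    // usage sketch:
    //   ulint sp = mtr.get_savepoint();
    //   /* ... acquire further page latches ... */
    //   mtr.rollback_to_savepoint(sp); // release only those taken since sp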
Add a requirement to the Debian control file for libfmt 7.0,
which is needed for building MariaDB.
This supports the SFORMAT function added in MDEV-25015.
Also update autobake-deb.sh so that old Debian/Ubuntu releases still
bundle the libfmt pulled in from upstream.
Closes #2062
While moving to prescribed dependencies in MDEV-28011, an error was
made in the merge. The Ubuntu and Debian supported architectures of
rocksdb-tools are different and need to be treated as such.
This actually had no effect, as our support of mariadb-plugin-rocksdb
was never different from the distro support of rocksdb-tools. Some
notes were added to this effect.
There is also nothing to do for Debian sid, and never should be.
The differentiation and grouping of distro codenames is for convenience
in merging upwards as more dependencies change across distro versions.
Fixing the versions, rather than relying on apt-cache to be correct,
prevents unstable changes between releases, and potentially
uninstallable packages like what happened in MDEV-28014.
Also correct the comment about zstd to reference MDEV-16525.
Problem:
========
When the mysql.gtid_slave_pos table uses the InnoDB engine, and
mysqld starts, it reads the table and begins a transaction. After
reading the value, it should end the transaction and release all
associated locks. The bug reported in DBAAS-7828 shows that when
autocommit is off, the locks are not released, resulting in
indefinite hangs on future attempts to change gtid_slave_pos. In
particular, the transaction was not properly finalized because
thd->server_status was not updated to reflect the end of the
transaction.
Solution:
========
This patch updates the code to properly commit the transaction after
reading gtid_slave_pos during mysqld start-up.
Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
Analysis: The utf8 character set is now utf8mb3 or utf8mb4, so
charset_master is not able to find utf8 at the beginning of the test.
Hence it skips the tests that use charset_master.
Fix: Rename utf8 to utf8mb3 in charset_master.
buf_page_get_zip(): Do not perform a system call inside a
memory transaction. Instead, if the page latch is unavailable,
abort the memory transaction and let the fall-back code path
wait for the page latch.
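A sketch of the pattern using Intel RTM intrinsics; the latch and page
types are illustrative, not InnoDB's (build with RTM support, e.g.
gcc -mrtm):

    #include <immintrin.h>

    // Inside a memory transaction we must never make a system call, so
    // a latch acquisition that would have to wait aborts the
    // transaction and retries on the ordinary, blocking code path.
    template <typename Latch, typename Page>
    void fix_page(Latch &latch, Page &page)
    {
      if (_xbegin() == _XBEGIN_STARTED) {
        if (!latch.try_lock_shared())
          _xabort(0);          // waiting could sleep in the kernel: abort
        page.fix();
        _xend();
        return;
      }
      // Fall-back path, reached when the transaction aborts:
      // blocking is safe here.
      latch.lock_shared();
      page.fix();
    }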
buf_pool_t::watch_remove(): Return the previous state of the block.
buf_page_init_for_read(): Use regular stores for moving the
buffer fix count of watch_remove() to the new block descriptor.
A more extensive version of this was reviewed by Daniel Black
and tested with Intel TSX-NI by Axel Schwenke and Matthias Leich.
My assumption that regular loads and stores would execute faster
in a memory transaction than operations like std::atomic::fetch_add()
turned out to be incorrect.
Problem:
========
When using mariadb-binlog with --raw and --stop-never, events from
the master's currently active log file should be written to their
respective log file specified by --result-file, and be visible on disk.
There is a bug where mariadb-binlog does not flush the result file
to disk when new events are received.
Solution:
========
Add a function call to flush mariadb-binlog’s result file after
receiving an event in --raw mode.
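A minimal sketch of the idea; the flag and file handle names are
assumptions about mariadb-binlog's internals:

    #include <cstdio>

    // After an event has been written in --raw mode, push it to disk
    // so the result file reflects the stream as it is received.
    void after_event_written(bool opt_raw, FILE *result_file)
    {
      if (opt_raw && result_file)
        fflush(result_file); // flush stdio buffering to the OS immediately
    }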
Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
Problem:
========
When both semi-sync and slave compression are enabled, the numbering
on packet headers can become out of sync between the primary and
replica servers. More specifically, after the master flushes its
write, it should increment the counters that track packets. The
bug is such that the master only updates the normal packet counter
and leaves the compressed packet counter alone.
Solution:
========
After the master flushes, additionally increment the compressed
packet counter.
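A condensed sketch of the fix; the NET fields are modeled on MariaDB's
pkt_nr and compress_pkt_nr counters, but the surrounding function is
illustrative:

    // Both packet sequence counters must advance when the master
    // flushes, or a compressed replica sees out-of-sync numbering.
    struct NET {
      bool compress;                 // slave_compressed_protocol in use
      unsigned char pkt_nr;          // normal packet sequence number
      unsigned char compress_pkt_nr; // compressed packet sequence number
    };

    void after_flush(NET *net)
    {
      net->pkt_nr++;            // was already done before the fix
      if (net->compress)
        net->compress_pkt_nr++; // the fix: keep the compressed counter in sync
    }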
Reviewed By:
============
Andrei Elkin: <andrei.elkin@mariadb.com>
The call mtr.add_suppression() that was added
in commit 75b7cd680b
for MemorySanitizer and Valgrind runs is causing
a result difference for the test rpl.rpl_gtid_stop_start.
Let us disable the binlog for executing that statement.
Also, the test perfschema.statement_program_lost_inst
would fail due to the changes to have_innodb.inc in this commit.
To compensate for that, we will make more --suite=perfschema
tests run without InnoDB, and explicitly enable InnoDB in
those tests that depend on a transactional storage engine.