An INSERT into a temporary table would fail to set the
index page as modified. If there were no other write operations
(such as UPDATE or DELETE) to the page, and the page was evicted,
we would read back the old contents of the page, causing
corruption or loss of data.
page_cur_insert_rec_write_log(): Call mtr_t::set_modified()
for temporary tables. Normally this is part of the mlog_open()
call, but the mlog_open() call was only present in debug builds.
This regression was caused by
commit 48192f963a
which was preparation for MDEV-11369 and was supposed to affect
debug builds only.
Thanks to Thirunarayanan Balathandayuthapani for debugging.
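To illustrate the failure mode, here is a minimal self-contained C++ sketch
(toy types, not the actual InnoDB buffer pool code): a page that is changed
in memory but never flagged as modified is discarded on eviction, and the
stale on-disk contents are read back.

  #include <cassert>
  #include <cstring>

  struct Page {
    char mem[16];   // in-memory contents (buffer pool copy)
    char disk[16];  // on-disk contents
    bool dirty = false;
  };

  void page_write(Page &p, const char *rec) {
    std::strncpy(p.mem, rec, sizeof p.mem - 1);
    // The write path must flag the page as modified; this is the step
    // that mtr_t::set_modified() performs and that the bug omitted.
    p.dirty = true;
  }

  void evict(Page &p) {
    if (p.dirty) std::memcpy(p.disk, p.mem, sizeof p.mem);  // flush first
    std::memset(p.mem, 0, sizeof p.mem);                    // drop from pool
    p.dirty = false;
  }

  int main() {
    Page p{};
    page_write(p, "new row");
    evict(p);
    // Fails exactly when the dirty flag was never set: the change is lost.
    assert(std::strcmp(p.disk, "new row") == 0);
  }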
When a table is renamed to an internal #sql2 or #sql-ib name during
a table-rebuilding DDL operation such as OPTIMIZE TABLE or ALTER TABLE,
and shortly after that a purge operation in an index on virtual columns
is attempted, the operation could fail, but purge would fail to release
the table reference.
innodb_acquire_mdl(): Release the reference if the table name is not
valid for acquiring a metadata lock (MDL).
innodb_find_table_for_vc(): Add a debug assertion if the table name
is not valid. This code path is for DML execution. The table
should have a valid name for executing DML, and furthermore an MDL
will prevent the table from being renamed.
row_vers_build_clust_v_col(): Add a debug assertion that both indexes
must belong to the same table.
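As an illustration of the invariant behind the innodb_acquire_mdl() fix,
here is a hedged C++ sketch (simplified stand-in types, not the real
InnoDB API): every early-exit path taken after a reference has been
acquired must release that reference before returning.

  #include <atomic>
  #include <string>

  struct Table {
    std::atomic<int> n_ref{0};
    std::string name;
    void acquire() { ++n_ref; }
    void release() { --n_ref; }
  };

  bool name_valid_for_mdl(const std::string &name) {
    // Internal #sql2/#sql-ib names cannot be used to acquire an MDL.
    return name.rfind("#sql", 0) != 0;
  }

  Table *acquire_mdl(Table *table) {
    table->acquire();
    if (!name_valid_for_mdl(table->name)) {
      table->release();  // the fix: this release used to be missing
      return nullptr;
    }
    return table;  // on success, the caller owns the reference
  }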
After iterating all fields and setting PART_INDIRECT_KEY_FLAG as
necessary, TABLE::mark_columns_used_by_virtual_fields() remembers
in TABLE_SHARE that this operation was done and need not be repeated.
But as the flag is set in TABLE_SHARE, PART_INDIRECT_KEY_FLAG must
be set in TABLE_SHARE::field[], not only in TABLE::field[].
Otherwise, any new TABLE opened from this TABLE_SHARE will
never have the flag set.
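A minimal sketch of why the flag must live in the shared object
(illustrative types, not the server's TABLE/TABLE_SHARE): instances are
created by copying the share, so a flag set only on one instance is
invisible to every instance opened later.

  #include <cassert>
  #include <vector>

  constexpr unsigned PART_INDIRECT_KEY_FLAG = 1U << 0;

  struct FieldDef { unsigned flags = 0; };

  struct Share {
    std::vector<FieldDef> field = std::vector<FieldDef>(3);
    bool vcol_fields_marked = false;  // "work already done" marker
  };

  struct Table {
    std::vector<FieldDef> field;
    explicit Table(const Share &s) : field(s.field) {}  // copies the share's flags
  };

  void mark_columns_used_by_virtual_fields(Share &s) {
    if (s.vcol_fields_marked) return;     // done once per share
    for (auto &f : s.field)               // must be the share's fields,
      f.flags |= PART_INDIRECT_KEY_FLAG;  // not one instance's copy
    s.vcol_fields_marked = true;
  }

  int main() {
    Share share;
    mark_columns_used_by_virtual_fields(share);
    Table later(share);  // a TABLE opened later from the same share
    assert(later.field[0].flags & PART_INDIRECT_KEY_FLAG);
  }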
My conflict resolution for the script did not work out after all,
and apparently I was testing a wrong version. Revert MDEV-15511
from MariaDB 10.2 for now.
trx_purge_add_update_undo_to_history(): Relax the too strict assertion
by removing the condition on srv_fast_shutdown (innodb_fast_shutdown).
Rollback is allowed during any form of shutdown.
buf_dump(): Only generate the output when shutdown is in progress.
log_write_up_to(): Only generate the output before actually writing
to the redo log files.
srv_purge_should_exit(): Rate-limit the output, and instead of
displaying the work done, indicate the work that remains to be done
until the completion of the slow shutdown.
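A hedged sketch of the rate-limited reporting (the function name, message,
and 15-second interval are illustrative, not the actual server code):

  #include <chrono>
  #include <cstdio>

  void report_remaining(unsigned long trx_left, unsigned long undo_left) {
    using clock = std::chrono::steady_clock;
    static clock::time_point last;  // epoch on the first call
    const auto now = clock::now();
    if (now - last < std::chrono::seconds(15))
      return;                       // rate limit: at most one line per interval
    last = now;
    // Report the work that remains, not the work already done.
    std::printf("Waiting for %lu transactions and %lu undo records to be"
                " processed before slow shutdown can complete\n",
                trx_left, undo_left);
  }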
The bug appears because condition pushdown into the WHERE clause of a
materialized derived table/view worked incorrectly. The
excl_dep_on_grouping_fields() method, which checks whether a condition
can be pushed into the WHERE clause, was missing the case when an
Item_cond is used. For Item_cond elements this method always returned
a positive result (that the condition can be pushed), so such a
condition was pushed even when it shouldn't be.
To fix this, a new Item_cond::excl_dep_on_grouping_fields() method is added.
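The shape of the fix, as a hedged sketch over a simplified item tree (not
the server's Item hierarchy): an AND/OR node is pushable only if every one
of its arguments is pushable, instead of unconditionally returning true.

  #include <memory>
  #include <vector>

  struct Item {
    virtual ~Item() = default;
    virtual bool excl_dep_on_grouping_fields() const = 0;
  };

  struct Item_field : Item {
    bool grouping;  // true if this column is one of the grouping fields
    explicit Item_field(bool g) : grouping(g) {}
    bool excl_dep_on_grouping_fields() const override { return grouping; }
  };

  struct Item_cond : Item {  // AND/OR over a list of arguments
    std::vector<std::unique_ptr<Item>> args;
    bool excl_dep_on_grouping_fields() const override {
      for (const auto &arg : args)  // the fix: check every argument
        if (!arg->excl_dep_on_grouping_fields())
          return false;             // one non-pushable argument blocks the push
      return true;
    }
  };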
This patch allows condition pushdown into a materialized derived table/view
when the table is used in INSERT ... SELECT, multi-table UPDATE, or
multi-table DELETE statements.
This patch introduces support for the system variable eq_range_index_dive_limit
that has existed in MySQL since 5.6. The variable sets a limit on
index dives into equality ranges. Index dives are performed by the optimizer
to estimate the number of rows in range scans. Index dives usually provide
a good estimate, but they are pretty expensive. To estimate the number of rows
in equality ranges, statistical data on indexes can be employed instead. This
gives less accurate estimates, but it is cheap. So if the number of equality
dives required by an index scan exceeds the set limit, no dives for equality
ranges are performed by the optimizer for this index.
As the new system variable is introduced in a stable version, its default
value is a special value meaning there is no limit on the number
of index dives performed by the optimizer.
The patch partially uses the MySQL code for WL 5957
'Statistics-based Range optimization for many ranges'.
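A hedged sketch of the decision the variable controls (a simplified model,
not the optimizer's real interface; 0 models the "no limit" default
described above): once the number of equality ranges reaches the limit, the
per-range dives are skipped in favor of index statistics.

  #include <cstddef>

  unsigned long eq_range_index_dive_limit = 0;  // 0 = unlimited dives

  double estimate_rows(std::size_t n_eq_ranges,
                       double rows_per_dive,      // accurate but expensive
                       double rows_per_key_stat)  // cheap, statistics-based
  {
    const bool use_stats = eq_range_index_dive_limit != 0 &&
                           n_eq_ranges >= eq_range_index_dive_limit;
    const double per_range = use_stats ? rows_per_key_stat : rows_per_dive;
    return per_range * static_cast<double>(n_eq_ranges);
  }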
The MySQL 5.7 TRUNCATE TABLE is inherently incompatible
with hot backup, because it creates and deletes a separate
log file and does not write redo log for all changes of the
InnoDB data dictionary tables. Refuse to create a corrupted backup
if the unsafe form of TRUNCATE was executed.
Note: Undo log tablespace truncation cannot be detected easily.
Also it is incompatible with backup, for similar reasons.
xtrabackup_backup_func(): "Subscribe to" the log events before
the first invocation of xtrabackup_copy_logfile().
recv_parse_or_apply_log_rec_body(): If the function pointer
log_truncate is set, invoke it to report MLOG_TRUNCATE.
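A hedged sketch of the hook mechanism (the signature and the enumerator
value are illustrative): the redo log parser invokes an optional callback
so that the backup can react to an unsafe MLOG_TRUNCATE record.

  #include <cstdint>

  // Set by the backup code before log copying starts; nullptr in the server.
  void (*log_truncate)() = nullptr;

  enum mlog_id_t : std::uint8_t { MLOG_TRUNCATE = 58 /* illustrative value */ };

  void parse_or_apply_log_rec_body(mlog_id_t type) {
    if (type == MLOG_TRUNCATE && log_truncate != nullptr)
      log_truncate();  // e.g. flag the backup as unrecoverable and abort
  }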
Remove the logic for skipping the test if a log checkpoint occurred,
and the logic for tolerating failures. Thanks to MDEV-16791,
MLOG_INDEX_LOAD is supposed to always work.
Amend commit b853b4fd88
that was reverted in commit 29150e2391.
recv_parse_log_recs(): Do check for corrupted redo log or file
system before checking for len==0, but only read *ptr if
it is not past the end of the buffer (end_ptr).
recv_parse_log_rec(): Report incorrect redo log type
in a consistent way with recv_parse_or_apply_log_rec_body().
This is a follow-up to commit f30c5af42e.
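A hedged sketch of the parsing rule (simplified return codes and an
illustrative type bound): check for corruption before concluding
"incomplete record", but only dereference ptr while it is inside the
buffer.

  #include <cstdint>

  enum parse_status { PARSE_OK, PARSE_NEED_MORE, PARSE_CORRUPT };

  parse_status parse_one_rec(const std::uint8_t *ptr,
                             const std::uint8_t *end_ptr) {
    // Only read *ptr if it is not past the end of the buffer (end_ptr).
    if (ptr != end_ptr && *ptr > 63 /* illustrative largest valid type */)
      return PARSE_CORRUPT;    // corruption is checked first
    if (ptr == end_ptr || *ptr == 0)
      return PARSE_NEED_MORE;  // end of buffer, or len == 0
    return PARSE_OK;
  }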
The Pool poisoning that was introduced in MDEV-15030 introduced
race conditions in AddressSanitizer builds, because concurrent
poisoning and unpoisoning were not prevented by any synchronization
primitive.
Pool::get(): Protect the unpoisoning by m_lock_strategy.
Pool::mem_free(): Protect the poisoning by m_lock_strategy.
Pool::putl(): Renamed from put(), because now the caller is
responsible for invoking m_lock_strategy.
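A hedged sketch of the synchronization fix on a simplified pool
(std::mutex stands in for m_lock_strategy): poisoning and unpoisoning now
happen only while the lock is held, so they cannot race; without
AddressSanitizer the macros compile to no-ops.

  #include <mutex>
  #include <vector>

  #ifdef __SANITIZE_ADDRESS__
  # include <sanitizer/asan_interface.h>
  #else
  # define ASAN_POISON_MEMORY_REGION(p, n) ((void)(p), (void)(n))
  # define ASAN_UNPOISON_MEMORY_REGION(p, n) ((void)(p), (void)(n))
  #endif

  template <typename T> struct Pool {
    std::mutex m_lock;        // stands in for m_lock_strategy
    std::vector<T *> m_free;

    T *get() {
      std::lock_guard<std::mutex> g(m_lock);
      if (m_free.empty()) return nullptr;  // a real pool would allocate here
      T *p = m_free.back();
      m_free.pop_back();
      ASAN_UNPOISON_MEMORY_REGION(p, sizeof(T));  // protected by m_lock
      return p;
    }

    void mem_free(T *p) {
      std::lock_guard<std::mutex> g(m_lock);
      ASAN_POISON_MEMORY_REGION(p, sizeof(T));    // protected by m_lock
      putl(p);
    }

    // Renamed from put(): the caller is responsible for holding the lock.
    void putl(T *p) { m_free.push_back(p); }
  };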
If trx_free() and trx_create_low() were called while a call to
trx_reference() was pending, we could get a reference to a wrong
transaction object.
trx_reference(): Return NULL if the trx->id no longer matches.
lock_trx_release_locks(): Assign trx->id = 0, so that trx_reference()
will not return a reference to this object.
trx_cleanup_at_db_startup(): Assign trx->id = 0.
assert_trx_is_free(): Assert !trx->n_ref. Assert trx->id == 0,
now that it will be cleared as part of a transaction commit.
Allocate trx->lock.rec_pool and trx->lock.table_pool directly from trx_t.
Remove unnecessary use of std::vector.
In order to do this, move some definitions from lock0priv.h to
lock0types.h, so that ib_lock_t will not be an opaque type.
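A hedged sketch of the id-validated reference pattern (simplified types,
not the real trx_t): because the object can be freed and reused
concurrently, the id is re-checked under the object's mutex, and a
mismatch yields NULL instead of a reference to the wrong transaction.

  #include <cstdint>
  #include <mutex>

  struct trx_t {
    std::mutex mutex;
    std::uint64_t id = 0;  // 0 once committed or not in use
    unsigned n_ref = 0;
  };

  trx_t *trx_reference(trx_t *trx, std::uint64_t expected_id) {
    std::lock_guard<std::mutex> g(trx->mutex);
    if (trx->id != expected_id)  // object was freed and reused meanwhile
      return nullptr;
    ++trx->n_ref;
    return trx;
  }

  void trx_release_locks(trx_t *trx) {
    std::lock_guard<std::mutex> g(trx->mutex);
    trx->id = 0;  // from now on, trx_reference() cannot return this object
  }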
Item_subselect::is_expensive() used to return FALSE (inexpensive) whenever
it saw that one of the SELECTs in the subquery's UNION was degenerate. It
ignored the fact that other parts of the UNION might not be inexpensive,
including the case where other parts of the UNION have no query plan yet.
For a subquery of the form col >= ANY (SELECT 'foo' UNION SELECT 'bar'),
this would cause the query to be considered inexpensive when there is
no query plan for the second part of the UNION, which in turn would
cause the SELECT 'foo' part to be computed and freed while still inside
JOIN::optimize for that SELECT (see the MDEV comment for the full description).
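A hedged sketch of the corrected predicate (simplified structures): a
UNION is inexpensive only if every part is known to be inexpensive, and a
part with no query plan yet must not make the whole subquery look cheap.

  #include <vector>

  struct Select {
    bool has_plan;    // JOIN::optimize has produced a plan
    bool degenerate;  // e.g. SELECT 'foo' with no tables
  };

  bool is_expensive(const std::vector<Select> &union_parts) {
    for (const auto &s : union_parts) {
      if (!s.has_plan)
        return true;  // cost unknown: treat as expensive, do not evaluate now
      if (s.degenerate)
        continue;     // one cheap part proves nothing about the others
      // A real implementation would also inspect this part's plan cost.
    }
    return false;
  }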
If we have a cluster of two or more nodes replicating from an async master,
the binlog_format is set to STATEMENT, and multi-row inserts are executed
on a table with an auto_increment column such that the values are
generated automatically by MySQL, then the node generates wrong
auto_increment values, which differ from what was generated on the
async master.
The causes and fixes:
1. We need to improve the processing of auto-increment value changes
after the cluster size changes.
2. If wsrep auto_increment_control is switched on during operation of
the node, then we should immediately update the auto_increment_increment
and auto_increment_offset global variables, without waiting for the next
invocation of the wsrep_view_handler_cb() callback. In the current version
these variables retain their initial values if wsrep_auto_increment_control
is switched on during operation of the node, which leads to inconsistent
results on different nodes in some scenarios.
3. If wsrep auto_increment_control is switched off during operation of the
node, then we must restore the original values of the auto_increment_increment
and auto_increment_offset global variables, as set by the user. To make this
possible, we need to add "shadow copies" of these variables (which store
the latest values set by the user). A sketch of the increment/offset
arithmetic follows this list.
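A hedged sketch of the arithmetic the two variables implement (illustrative
code, not the server's update_auto_increment logic): with
auto_increment_increment equal to the cluster size and a distinct
auto_increment_offset per node, the nodes generate disjoint sequences.

  #include <cstdio>

  unsigned long long next_auto_inc(unsigned long long prev,
                                   unsigned long long increment,
                                   unsigned long long offset) {
    // Smallest value > prev that is congruent to offset (mod increment).
    unsigned long long v = prev + 1;
    unsigned long long rem = offset % increment;
    if (v % increment != rem)
      v += (rem + increment - v % increment) % increment;
    return v;
  }

  int main() {
    // 3-node cluster, node with offset 2: the values 2, 5, 8 are printed,
    // never colliding with the offset-1 or offset-3 nodes.
    unsigned long long v = 0;
    for (int i = 0; i < 3; i++) {
      v = next_auto_inc(v, 3, 2);
      std::printf("%llu ", v);
    }
  }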
aws_key_management needs the current directory to be the datadir during
initialization, because it scans the current directory for encrypted keys.
The fix is to ensure that plugin initialization in mariabackup happens
after the call to my_setwd(mysql_real_data_home).
The test causes simulated server crashes with DBUG_SUICIDE().
It also relies on transactions that were committed right before the
crash to be visible after the crash (that is, it requires durability).
Run the test with transaction durability enabled: set
rocksdb-flush-log-at-trx-commit=1.
Truncate incorrect values in convert_period_to_month() so that
PERIOD_DIFF never returns a value outside the 2^23 range.
Also, for safety, increase the buffer sizes for int10_to_str
to be sufficiently big for any int10_to_str result.
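A hedged sketch of the clamping idea (not the exact server code; the real
convert_period_to_month() also maps two-digit years): treat the period as
YYYYMM and truncate out-of-range parts so the result stays far below 2^23.

  #include <cstdint>

  std::uint32_t convert_period_to_month(std::uint32_t period) {
    std::uint32_t year = period / 100;
    std::uint32_t month = period % 100;
    if (year > 9999) year = 9999;      // truncate instead of letting the
    if (month > 12) month = 12;        // value escape the expected range
    if (month == 0) return year * 12;  // degenerate input, still bounded
    return year * 12 + month - 1;      // 9999 * 12 + 11 < 2^23
  }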
The package libmariadbclient18 contains the dialog.so plugin, which the
new libmariadb3 also ships. As they both use the exact same path, the latter
must be marked with Breaks and Replaces relationships.
Note: This fix is a conservative hack for the stable releases 10.2 and 10.3.
In 10.4, the development release at the time, we will clean up how the
libmariadb3 packaging and its -compat packages are done, to match
what is done in downstream official Debian.
recv_parse_log_recs(): Do not check for corruption before
checking for end-of-log-buffer. For some reason, adding the
check to the logical-looking place would cause intermittent
recovery failures in the tests innodb.innodb-index and
innodb_gis.rtree_compress2.
recv_parse_log_recs(): Check for corruption before checking for
end-of-log-buffer.
mlog_parse_initial_log_record(), page_cur_parse_delete_rec():
Flag corruption for out-of-bounds values, and let the caller
dump the corrupted redo log extract.
If recv_sys_justify_left_parsing_buf() has been invoked, it is possible
that recv_previous_parsed_rec_offset is after the current offset.
In this case, we must not dump any bytes before the current record.
If the LOG_BLOCK_HDR_DATA_LEN field is corrupted, scanning the
log records could fail in strange ways. It is better to validate
the field as part of validating each log block.
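A hedged sketch of the per-block check (the 512-byte log block layout
constants are InnoDB's; the function itself is illustrative): reject any
LOG_BLOCK_HDR_DATA_LEN value that cannot describe a payload fitting
between the block header and the checksum trailer.

  #include <cstdint>

  constexpr std::uint32_t OS_FILE_LOG_BLOCK_SIZE = 512;
  constexpr std::uint32_t LOG_BLOCK_HDR_SIZE = 12;
  constexpr std::uint32_t LOG_BLOCK_TRL_SIZE = 4;

  bool log_block_data_len_valid(std::uint32_t data_len) {
    // Either the block is completely full, or the payload must end
    // between the header and the trailing checksum.
    return data_len == OS_FILE_LOG_BLOCK_SIZE ||
           (data_len >= LOG_BLOCK_HDR_SIZE &&
            data_len <= OS_FILE_LOG_BLOCK_SIZE - LOG_BLOCK_TRL_SIZE);
  }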