row_ins_clust_index_entry_low(): Do not call dtuple_t::trim()
before row_ins_clust_index_entry_by_modify(), so that the values
of all columns will be available in row_upd_build_difference_binary().
If applicable, the tuple can be trimmed in btr_cur_optimistic_update()
or btr_cur_pessimistic_update(), which will be called by
row_ins_clust_index_entry_by_modify().
The setting innodb_safe_truncate=ON reduces compatibility with older
versions of MariaDB and backup tools in two ways.
First, we will be writing TRX_UNDO_RENAME_TABLE records, which older
versions do not know about. These records could be misinterpreted if
a DDL transaction was recovered and would be rolled back.
Such rollback is only possible if the server was killed while
an incomplete DDL transaction was persisted. On transaction completion,
the insert_undo log pages would only be repurposed for new undo log
allocations, and their contents would not matter. So, older versions
will not have a problem with innodb_safe_truncate=ON if the server was
shut down cleanly.
Second, to prevent such recovery failure, innodb_safe_truncate=ON will
cause a modification of the redo log format identifier, which will
prevent older versions from starting up after a crash. MariaDB Server
versions older than 10.2.13 will refuse to start up altogether, even
after clean shutdown.
A server restart with innodb_safe_truncate=OFF will restore compatibility
with older server and backup versions.
The syntax error happened because we had not implemented a different print for
percentile functions. The syntax is a bit different when we use percentile functions
as window functions in comparision to normal window functions.
Implemented a seperate print function for percentile functions
row_build_index_entry_low(): Assert that when the value of a
virtual column is not available, this can only happen when
the index creation was completed but not committed yet.
This change is not fixing any bug, making a debug assertion
stricter, so that bugs can be caught in the future.
Ultimately, we should change the InnoDB undo log format so that
all actual secondary index keys are stored there, also for
virtual or spatial indexes. In that way, purge and rollback would
be more straightforward.
in Field_iterator_table::create_item
When IN predicate is converted to IN subquery we have to ensure that
any item from the select list of the subquery has some name and this name
is unique across the select list.
This was not guaranteed by the code before the patch for MDEV-17222.
If the name of an item of the select list was not set, and this happened
for binary constants, then the server crashed. If the first row in the IN
list contained the same constant in two different positions then the server
returned an error message.
This was fixed by providing all constants in the first row of the IN list
with generated names.
When we have a query which has implicit_grouping then we are sure that we would end up with only one
row so there is no point to do DISTINCT computation
upon INSERT .. SELECT
The function Item *Item_direct_view_ref::derived_field_transformer_for_where()
erroneously did not strip off ref wrappers from references to materialized
derived tables / views. As a result the expressions that contained some
references of the type Item_direct_view_ref to columns of a materialized
derived table / view V were pushed into V incorrectly. This could cause
crashes for some INSERT ... SELECT statements.
A prepared backup from Mariabackup does not really need to contain any
redo log file, because all log will have been applied to the data files.
When the user copies a prepared backup to a data directory (overwriting
existing files), it could happen that the data directory already contained
redo log files from the past. mariabackup --copy-back) would delete the
old redo log files, but a user’s own copying script might not do that.
To prevent corruption caused by mixing an old redo log file with data
files from a backup, starting with MDEV-13311, Mariabackup would create
a zero-length ib_logfile0 that would prevent startup.
Actually, there is no need to prevent InnoDB from starting up when a
single zero-length file ib_logfile0 is present. Only if there exist
multiple data files of different lengths, then we should refuse to
start up due to inconsistency. A single zero-length ib_logfile0 should
be treated as if the log files were missing: create new log files
according to the configuration.
open_log_file(): Remove. There is no need to open the log files
at this point, because os_file_get_status() already determined
the size of the file.
innobase_start_or_create_for_mysql(): Move the creation of new
log files a little later, not when finding out that the first log
file does not exist, but after finding out that it does not exist
or it exists as a zero-length file.
Rename the 10.2-specific configuration option innodb_unsafe_truncate
to innodb_safe_truncate, and invert its value.
The default (for now) is innodb_safe_truncate=OFF, to avoid
disrupting users with an undo and redo log format change within
a Generally Available (GA) release series.
We keep the MySQL 5.7 backup-incompatible TRUNCATE TABLE
only in MariaDB Server 10.2. In 10.3 and later releases,
only the backup-friendly TRUNCATE will be available.
While MariaDB Server 10.2 is not really guaranteed to be compatible
with Percona XtraBackup 2.4 (for example, the MySQL 5.7 undo log format
change that could be present in XtraBackup, but was reverted from
MariaDB in MDEV-12289), we do not want to disrupt users who have
deployed xtrabackup and MariaDB Server 10.2 in their environments.
With this change, MariaDB 10.2 will continue to use the backup-unsafe
TRUNCATE TABLE code, so that neither the undo log nor the redo log
formats will change in an incompatible way.
Undo tablespace truncation will keep using the redo log only. Recovery
or backup with old code will fail to shrink the undo tablespace files,
but the contents will be recovered just fine.
In the MariaDB Server 10.2 series only, we introduce the configuration
parameter innodb_unsafe_truncate and make it ON by default. To allow
MariaDB Backup (mariabackup) to work properly with TRUNCATE TABLE
operations, use loose_innodb_unsafe_truncate=OFF.
MariaDB Server 10.3.10 and later releases will always use the
backup-safe TRUNCATE TABLE, and this parameter will not be
added there.
recv_recovery_rollback_active(): Skip row_mysql_drop_garbage_tables()
unless innodb_unsafe_truncate=OFF. It is too unsafe to drop orphan
tables if RENAME operations are not transactional within InnoDB.
LOG_HEADER_FORMAT_10_3: Replaces LOG_HEADER_FORMAT_CURRENT.
log_init(), log_group_file_header_flush(),
srv_prepare_to_delete_redo_log_files(),
innobase_start_or_create_for_mysql(): Choose the redo log format
and subformat based on the value of innodb_unsafe_truncate.
The test is shutting down InnoDB, corrupting a file, and finally
restarting InnoDB. Before the shutdown, the test created the table
and inserted some records. Before MDEV-12288, there would be no access
to the table after server restart, but after MDEV-12288 purge would
reset the transaction identifier after the INSERT, and this would
sometimes happen after the restart.
To make the test deterministic, wait for purge to complete before the
shutdown.
The bug appears as a slave SQL thread hanging in
rpl_parallel_thread_pool::get_thread() while there are no slave worker
threads to awake it.
The reason of the hang is that at the parallel slave worker pool
activation the being stared SQL thread could read the worker pool size
concurrently with pool deactivation. At reading the SQL thread did not
employ necessary protection from a race.
Fixed with making the SQL thread at the pool activation first
to grab the same lock as potential deactivator also does prior
to access the pool size.
table for purge thread
Problem:
=======
Purge tries to fetch mdl lock for the whole table even though it tries
to open one of the partition. But table name length was wrongly set to indicate
the partition name too.
Solution:
========
- Table name length should identify the table name only not the partition name.
table for purge thread
Problem:
=======
Purge tries to fetch mdl lock for the whole table even though it tries
to open one of the partition. But table name length was wrongly set to indicate
the partition name too.
Solution:
========
- Table name length should identify the table name only not the partition name.
derived table / view by equality
Now rows of a materialized derived table are always put into a
temporary table before join operation. If BNLH is used to join this
table with the result of a partial join then both operands of the
join are actually put into main memory. In most cases this is not
efficient.
We could avoid this by sending the rows of the derived table directly
to the join operation. However this kind of data flow is not supported
yet.
Fixed by not allowing usage of hash join algorithm to join a materialized
derived table if it's joined by an equality predicate of the form
f=e where f is a field of the derived table.
Change for the test case in 10.3: splitting must be turned off to preserve
the explain.
derived table / view by equality
Now rows of a materialized derived table are always put into a
temporary table before join operation. If BNLH is used to join this
table with the result of a partial join then both operands of the
join are actually put into main memory. In most cases this is not
efficient.
We could avoid this by sending the rows of the derived table directly
to the join operation. However this kind of data flow is not supported
yet.
Fixed by not allowing usage of hash join algorithm to join a materialized
derived table if it's joined by an equality predicate of the form
f=e where f is a field of the derived table.
derived table / view by equality
Now rows of a materialized derived table are always put into a
temporary table before join operation. If BNLH is used to join this
table with the result of a partial join then both operands of the
join are actually put into main memory. In most cases this is not
efficient.
We could avoid this by sending the rows of the derived table directly
to the join operation. However this kind of data flow is not supported
yet.
Fixed by not allowing usage of hash join algorithm to join a materialized
derived table if it's joined by an equality predicate of the form
f=e where f is a field of the derived table.
This would happen especially in optimistic parallel replication, where there
is a good chance that a transaction will be rolled back (due to conflicts)
after it has executed record_gtid(). If the transaction did any deletions of
old rows as part of record_gtid(), those deletions will be undone as well.
And the code did not properly ensure that the deletions would be re-tried.
This patch makes record_gtid() remember the list of deletions done as part
of a transaction. Then in rpl_slave_state::update() when the changes have
been committed, we discard the list. However, in case of error and rollback,
in cleanup_context() we will instead put the list back into
rpl_global_gtid_slave_state so that the deletions will be re-tried later.
Probably fixes part of the cause of MDEV-12147 as well.
Signed-off-by: Kristian Nielsen <knielsen@knielsen-hq.org>