1. Removed TIMESTAMP/TRANSACTION unit auto-detection in favor of default TIMESTAMP.
Reasons:
1.1. rare practical use and doubtful advantage of such auto-detection;
1.2. it conflicts with MDEV-16226 (TRX_ID-based versioned tables performance improvement).
Needless check_unit membership removed.
2. SQL: versioning type handling refactoring
Vers_type_handler hierarchy stores versioning properties of type.
virtual Type_handler::vers() accesses specialization of
Vers_type_handler for specific type.
virtual Vers_type_handler::kind() returns versioning kind
(timestamp/trx_id).
Removed Type_handler::Vers_history_point_check_unit() in favor of
Type_handler::vers().
Renames:
require_timestamp() -> require_timestamp_error()
require_trx_id() -> require_trx_id_error()
EDIT by Alexander Barkov (@abarkov):
check_sys_fields() moved to Vers_type_handler::check_sys_fields()
For release builds, do not declare unused variables.
unpack_row(): Omit a debug-only variable from WSREP diagnostic message.
create_wsrep_THD(): Fix -Wmaybe-uninitialized for the PSI_thread_key.
Analysis:
========
In general if there are three groups.
1 - Inserts 32 which fails due to local entry '32' on slave.
2 - Inserts 33
3 - Inserts 34
Each group considers itself as a waiter and it waits for prior group 'waitee'.
This is done in 'register_wait_for_prior_event_group_commit'. If there is no
other parallel group being scheduled then no waitee will be there.
Let us assume 3 groups are being scheduled in parallel.
3-> waits for 2-> waits for->1
'1' upon completion it checks is there any registered subsequent waiter. If
so it wakes up the subsequent waiter with its execution status. This execution
status is stored in wakeup_error.
If '1' failed then it sends corresponding wakeup_error to 2. Then '2' aborts
and it propagates error to '3'. So all further commits are aborted. This
mechanism works only when all transactions reach a stage where they are
waiting for their prior commit to complete.
In case of optimistic following scenario occurs.
1,2,3 are scheduled in parallel.
3 - Reaches group_commit_code waits for 2 to complete.
1 - errors out sets stop_on_error_sub_id=1.
When a group execution results in error its corresponding sub_id is set to
'stop_on_error_sub_id'. Any new groups queued for execution will check if
their sub_id is > stop_on_error_sub_id. If it is true their execution will be
skipped as prior group execution failed. 'skip_event_group=1' will be set.
Since the execution of SQL thread is about to stop we just skip execution of
all the following event groups. We still do all the normal waiting and wakeup
processing between the event groups as a simple way to ensure that everything
is stopped and cleaned up correctly.
Upon error '1' transaction checks for registered waiters. Since no one is
there it simply goes away.
2 - Starts the execution. It checks do I have a waitee.
Since wait_commit_sub_id == entry->last_committed_sub_id no waitee is set.
Secondly: 'entry->stop_on_error_sub_id' is set by '1'st execution. Now
'handle_parallel_thread' code checks if the current group 'sub_id' is greater
than the 'sub_id' set within 'stop_on_error_sub_id'.
Since the above is true 'skip_event_group=true' is set. Simply call
'wait_for_prior_commit' to wakeup all waiters. Group '2' didn't had any
waitee and its execution is skipped. Hence its wakeup_error=0.It sends a
positive wakeup signal to '3'. Which commits. This results in a missed
transaction. i.e 33 is missed and 34 is committed.
Fix:
===
When a worker learns that an earlier transaction execution has failed, and it
should not proceed for further execution, it should mark its own execution
status as failed so that it alerts its followers to abort as well.
mark big_tables deprecated, the server can put temp tables on disk
as needed avoiding "table full" errors.
in case someone would really need to force a tmp table to be created
on disk from the start and for testing allow tmp_memory_table_size
to be set to 0.
fix tests to use that instead (and add a test that it actually
works).
make sure in-memory TREE size limit is never 0 (it's [ab]using
tmp_memory_table_size at the moment)
remove few sys_vars.*_basic tests
A syntax error in the mysqld_multi.sh script has been fixed
here + a "--defaults-group-suffix" option has been moved to
the top of the mysqld options list.
Thanks to Eugene Kosov for noting that the fix is incomplete.
It turns out that on instant DROP/reorder column (MDEV-15562),
we must always write the metadata record, even though the table
was empty. Alternatively, we should guarantee that all undo
log records for the table have been purged. (Attempting to do
that by updating table_id leads to other problems; see
commit 1b31d8852c00b4bab6e6fe179b97db45ccb8d535.)
It would be tempting to remove dict_index_t::clear_instant_alter()
altogether, but it turns that we need that when the instant ALTER TABLE
operation of a first-time DROP COLUMN is being rolled back.
innobase_instant_try(): Clarify a comment. Purge never calls
dict_index_t::clear_instant_alter(), but it may invoke
dict_index_t::clear_instant_add(). On first-time instant DROP/reorder,
always write a metadata record, even if the table is empty.
The test encryption.innodb-redo-badkey was accidentally disabled
until commit 23657a2101 enabled
it recently. Once it was enabled, it started failing randomly.
recv_recover_corrupt_page(): Do not assume that any redo log exists
for the page. A page may be unnecessarily read by read-ahead.
When noting the corruption, reset recv_addr->state to RECV_PROCESSED,
so that even if the same page is re-read again, we will only
decrement recv_sys->n_addrs once.
chkconfig --add and --del [might] invoke /sbin/insserv
and even if chkconfig exists, insserv might not (SLES15).
Ignore chkconfig --del errors - it's a "best effort" cleanup anyway
The test innodb_fts.fulltext_table_evict was only creating 1000 tables
with fulltext indexes, only to check that no tables with fulltext
indexes are being evicted.
The reason why tables containing fulltext indexes cannot be evicted is
that fts_optimize_init() invokes dict_table_prevent_eviction().
For CMAKE_BUILD_TYPE=Debug, the default MYSQL_MAINTAINER_MODE=AUTO
implies -Werror along with other flags in cmake/maintainer.cmake,
which would break the debug builds when CMAKE_CXX_FLAGS include -O2.
This fix includes a backport of 6dd3f24090
from MariaDB 10.3.
The crash scenario is as follows:
(1) A non-empty table exists.
(2) MDEV-15562 instant ADD/DROP/reorder has been invoked.
(3) Some purgeable undo log exists for the table.
(4) The table becomes empty, containing not even any delete-marked records,
only containing the hidden metadata record that was added in (2).
(5) An instant ADD/DROP/reorder column is executed, and the table
is emptied and the (2) metadata removed.
(6) Purge processes an undo log record from (3), which will refer to
a non-existent clustered index field, because the metadata that
was created in (2) was remoeved in (5).
We fix this by adjusting step (5) so that we will never remove the
MDEV-15562-style metadata record. Removing the MDEV-11369 metadata
record (instant ADD COLUMN to the end of the table) is completely
fine at any time when the table becomes empty, because
dict_index_t::n_fields will remain unchanged.
innobase_instant_try(): Never remove the MDEV-15562 metadata record.
page_cur_delete_rec(): Do not reset FIL_PAGE_TYPE when the
MDEV-15562 metadata record is being removed as part of
btr_cur_pessimistic_update() invoked by innobase_instant_try().
Always initialize ScopedStatementReplication::saved_binlog_format,
so that GCC cannot emit a bogus warning about
ScopedStatementReplication::~ScopedStatementReplication() using the
variable.
The code was originally introduced in
commit d998da0306.
lock_print_info::operator(): Do not dereference purge_sys.query in case
it is NULL. We would not initialize purge_sys if innodb_force_recovery
is set to 5 or 6.
The test case will be added by merge from 10.2.
Test innodb_read_only startup (which will be refused after a crash),
and test also innodb_force_recovery=5, and extract some change buffer
merge statistics. Omit any statistics about delete (purge) buffering,
because purge could happen at any time.
Use the sequence storage engine for populating the table.
A syntax error in the mysqld_multi.sh script has been fixed
here + a "--defaults-group-suffix" option has been moved to
the top of the mysqld options list.
In MariaDB 10.4.0, commit 09af00cbde
removed the crash-upgrade logic for the MariaDB 10.2
innodb_safe_truncate=OFF TRUNCATE TABLE (which was the only option
between MariaDB 10.2.2 and 10.2.18), but failed to adjust some
comments and code.
buf_page_io_complete(): Remove a bogus comment about TRUNCATE.
dict_recreate_index_tree(): Unused function; remove.
fil_space_t::stop_new_ops: Clarify the comment.
fil_space_acquire_low(): Remove a bogus comment about TRUNCATE.
fil_check_pending_ops(), fil_check_pending_io(): Adjust a warning message.
This code is only invoked as part of DISCARD TABLESPACE or DROP TABLE.
DROP TABLE is internally used as part of ALTER TABLE, OPTIMIZE TABLE,
or TRUNCATE TABLE.
RemoteDatafile::create_link_file(): Clarify a comment.
ibuf_delete_for_discarded_space(): Clarify the function comment.
dict_table_x_lock_indexes(), dict_table_x_unlock_indexes():
Merge with the only remaining caller, row_quiesce_set_state().
page_create_zip(): Remove a bogus comment about TRUNCATE.
Eliminate one InnoDB table with 128*16384 rows, and use
the sequence engine instead. Also, run everything in a single
transaction, to prevent purge from running concurrently
unnecessarily. (Starting with MariaDB Server 10.3, purge would
reset the DB_TRX_ID after INSERT.)
The arg was introduced as part of 75bcf1f9ad
to fix a SELinux problem caused by mysqld_safe accessing files it should
not be via the my_which function.
The root cause for this was fixed in 10.3, via
355ee6877b which eliminated the my_which
function from mysqld_safe entirely. Thus, in 10.3, this --basedir flag
is not necessary.