Apply the changes to InnoDB and XtraDB that had been
inadvertently skipped in the merge
commit ae476868a5
That merge failure sabotaged part of MDEV-20127:
>Revert a problematic auto_increment_increment 'fix' from 2014.
>This involves replacing the MDEV-8827 fix and in 10.1,
>removing some WSREP instrumentation.
The code changes were re-merged manually by executing the following:
# Get the parent of the problematic merge.
git checkout ae476868a5394041a00e75a29c7d45917e8dfae8^
# Perform the merge again.
git merge ae476868a5394041a00e75a29c7d45917e8dfae8^2
# Get the conflict resolution from that merge.
git checkout ae476868a5 .
# Note: Any changes to these files were removed (empty diff)!
git diff HEAD storage/{innobase,xtradb}/handler/ha_innodb.cc
# Apply the code changes:
git diff cf40393471b10ca68cc1d2804c22ab9203900978^2..MERGE_HEAD \
storage/{innobase,xtradb}/handler/ha_innodb.cc|
patch -p1
Basicaly it's an uninitialized read. 165 is 0xa5 which comes from TRASH_ALLOC()
Fix by calling a class ctor which initializes problematic
TMP_TABLE_PARAM::force_copy_fields field
Ever since MariaDB 10.0 (and MySQL 5.6.8), the innodb_log_file_size
and innodb_log_files_in_group can be changed between server restarts,
and the redo log files will be resized on server startup if needed.
- fts_optimize_thread() uses dict_table_t object instead of table id.
So that it doesn't acquire dict_sys->mutex. It leads to remove the
hang of dict_sys->mutex between fts_optimize_thread() and other threads.
- in_queue to indicate whether the table is in fts_optimize_queue. It
is protected by fts_optimize_wq->mutex to avoid any race condition.
- fts_optimize_init() adds the fts table to the fts_optimize_wq
InnoDB stores synced_doc_id + 1 value in FTS_CONFIG table. But
while reading the synced doc id from FTS_CONFIG table after restart,
InnoDB should read synced_doc_id - 1 to get the actual synced
doc id value.
using a specially crafted strings one could overflow `shift`
variable and cause a crash by dereferencing d10[-2147483648]
(on a sufficiently old gcc).
This is a correct fix and a test case for
Bug #29723340: MYSQL SERVER CRASH AFTER SQL QUERY WITH DATA ?AST
Old InnoDB/XtraDB versions only initialized FIL_PAGE_TYPE for
B-tree pages (to FIL_PAGE_INDEX), and left it uninitialized
(possibly containing FIL_PAGE_INDEX) for others. In MySQL
or MariaDB 5.5, the field is initialized on almost all pages,
but still not all of them.
In MariaDB 10.2 and later, buf_flush_init_for_writing() would
initialize the FIL_PAGE_TYPE on such old pages, but only after
passing the debug assertion that we are now removing from 10.1.
There, we will be able to modify fil_crypt_rotate_page() so
that it will skip the key rotation for pages that contain 0
in FIL_PAGE_TYPE.
In MariaDB 10.1, there is no logic that would initialize
FIL_PAGE_TYPE on data pages in old data files after an update.
So, encryption key rotation may routinely cause page flushes
on pages that contain 0 in FIL_PAGE_TYPE.
When the mysqld_multi script passes the --defaults-group-suffix
option to mysqld, it must remove the initial substring with the
group name ("mysqld") from option value, because otherwise substring
"mysqld" will be added to the group name and then the group name
will contain the word "mysqld" twice, which is wrong, because
mysqld itself adds the suffix received to the group name.
buf_flush_init_for_writing(): Assert that FIL_PAGE_TYPE is set
except when creating a new data file with a dummy first page.
buf_dblwr_create(): Ensure that FIL_PAGE_TYPE on all pages
will be initialized. Reset buf_dblwr_being_created at the end.
In the function recv_parse_or_apply_log_rec_body() there are debug checks
for validating the state of the page when redo log records are being
applied. Most notably, FIL_PAGE_TYPE should be set before anything else
is being written to the page.
ibuf_add_free_page(): Set FIL_PAGE_TYPE before performing any other changes.
Apply the correct pattern for debug instrumentation:
SET @save_dbug=@@debug_dbug;
SET debug_dbug='+d,...';
...
SET debug_dbug=@save_dbug;
Numerous tests use statements of the form
SET debug_dbug='-d,...';
which will inadvertently enable all DBUG tracing output,
causing unnecessary waste of resources.
The test main.index_merge_innodb is taking very much time,
especially on later versions (10.2 and 10.3).
Some of this could be attributed to the use of INSERT...SELECT,
which is time-consumingly creating explicit record locks in InnoDB
for the locking read in the SELECT part.
In 10.3 and later, some slowness can be attributed to MDEV-12288,
which makes the InnoDB purge thread spend time to reset transaction
identifiers in the inserted records. If we prevent purge from
running before all tables are dropped, the test seems to be
10% faster on an unoptimized debug build on 10.5. (A proper fix
would be to implement MDEV-515 and stop writing row-level undo log
records for inserts into an empty table or partition.)
At the same time, it should not hurt to make main.index_merge_myisam
to use the sequence engine. Not only could it be a little faster,
but the test would be slightly more readable.
read_statistics_for_tables_if_needed
Regression after 279a907, read_statistics_for_tables_if_needed() was
called after open_normal_and_derived_tables() failure.
Fixed by moving read_statistics_for_tables() call to a branch of
get_schema_stat_record() where result of open_normal_and_derived_tables()
is checked.
Removed THD::force_read_stats, added read_statistics_for_tables() instead.
Simplified away statistics_for_command_is_needed().
Analysis:
========
In general if there are three groups.
1 - Inserts 32 which fails due to local entry '32' on slave.
2 - Inserts 33
3 - Inserts 34
Each group considers itself as a waiter and it waits for prior group 'waitee'.
This is done in 'register_wait_for_prior_event_group_commit'. If there is no
other parallel group being scheduled then no waitee will be there.
Let us assume 3 groups are being scheduled in parallel.
3-> waits for 2-> waits for->1
'1' upon completion it checks is there any registered subsequent waiter. If
so it wakes up the subsequent waiter with its execution status. This execution
status is stored in wakeup_error.
If '1' failed then it sends corresponding wakeup_error to 2. Then '2' aborts
and it propagates error to '3'. So all further commits are aborted. This
mechanism works only when all transactions reach a stage where they are
waiting for their prior commit to complete.
In case of optimistic following scenario occurs.
1,2,3 are scheduled in parallel.
3 - Reaches group_commit_code waits for 2 to complete.
1 - errors out sets stop_on_error_sub_id=1.
When a group execution results in error its corresponding sub_id is set to
'stop_on_error_sub_id'. Any new groups queued for execution will check if
their sub_id is > stop_on_error_sub_id. If it is true their execution will be
skipped as prior group execution failed. 'skip_event_group=1' will be set.
Since the execution of SQL thread is about to stop we just skip execution of
all the following event groups. We still do all the normal waiting and wakeup
processing between the event groups as a simple way to ensure that everything
is stopped and cleaned up correctly.
Upon error '1' transaction checks for registered waiters. Since no one is
there it simply goes away.
2 - Starts the execution. It checks do I have a waitee.
Since wait_commit_sub_id == entry->last_committed_sub_id no waitee is set.
Secondly: 'entry->stop_on_error_sub_id' is set by '1'st execution. Now
'handle_parallel_thread' code checks if the current group 'sub_id' is greater
than the 'sub_id' set within 'stop_on_error_sub_id'.
Since the above is true 'skip_event_group=true' is set. Simply call
'wait_for_prior_commit' to wakeup all waiters. Group '2' didn't had any
waitee and its execution is skipped. Hence its wakeup_error=0.It sends a
positive wakeup signal to '3'. Which commits. This results in a missed
transaction. i.e 33 is missed and 34 is committed.
Fix:
===
When a worker learns that an earlier transaction execution has failed, and it
should not proceed for further execution, it should mark its own execution
status as failed so that it alerts its followers to abort as well.
chkconfig --add and --del [might] invoke /sbin/insserv
and even if chkconfig exists, insserv might not (SLES15).
Ignore chkconfig --del errors - it's a "best effort" cleanup anyway
A syntax error in the mysqld_multi.sh script has been fixed
here + a "--defaults-group-suffix" option has been moved to
the top of the mysqld options list.
Eliminate one InnoDB table with 128*16384 rows, and use
the sequence engine instead. Also, run everything in a single
transaction, to prevent purge from running concurrently
unnecessarily. (Starting with MariaDB Server 10.3, purge would
reset the DB_TRX_ID after INSERT.)
Also fixes:
MDEV-20560 Assertion `precision > 0' failed in decimal_bin_size upon SELECT with MOD short unsigned decimal
Changing the way how Item_func_mod calculates its max_length.
It now uses decimal_precision(), decimal_scale() and unsigned_flag
of its arguments, like all other Item_num_op descendants do.
In the function test_if_cheaper_ordering we make a decision if using an index is better than
using filesort for ordering. If we chose to do range access then in test_quick_select we
should make sure that cost for table scan is set to DBL_MAX so that it is not picked.
The unit files made systemd print:
systemd[1]: Started MariaDB 10.3.13 database server (multi-instance).
Let's add the instance name, so starting mariadb@foo.service
makes it print:
systemd[1]: Started MariaDB 10.3.13 database server (multi-instance foo).
mysqld_safe --dry-run needs to either call exit or return, depending if
it is being sourced or not, otherise return can lead to the error:
return: can only `return' from a function or sourced script
The original fix suggestion was proposed by FaramosCZ <mschorm@centrum.cz>
* Change the comments in mysql-log-rotate.sh to refer to mysqld, not mysqld_safe
as that's what most distros are using.
* Change err-log to log-error as err-log is no longer valid.
* Convert tab to space for consistency.
Fix build error with newer cmake
Fixes the following build error:
CMake Error at cmake/os/Linux.cmake:29 (STRING):
STRING sub-command REPLACE requires at least four arguments.
Call Stack (most recent call first):
CMakeLists.txt:101 (INCLUDE)
CMake Error at cmake/os/Linux.cmake:29 (STRING):
STRING sub-command REPLACE requires at least four arguments.
Call Stack (most recent call first):
CMakeLists.txt:101 (INCLUDE)
The error happens when CMAKE_SHARED_LINKER_{LANG}_FLAGS is not set.
Force the variable to be set to "" as input to prevent this.
Signed-off-by: Ryan Coe <bluemrp9@gmail.com>
Signed-off-by: Vicențiu Ciorbaru <vicentiu@mariadb.org>
Fixes bug introduced in commit 5415002.
Using script run-time filename does not always work. One cannot assume
that the filename is always the same as there might be temporary file
names used by dpkg in certain situations. See Debian #920415.
The same fix has been successfully in use in Debian official packages
since February 2019:
https://salsa.debian.org/mariadb-team/mariadb-10.3/commit/6440c0d6e75
Problem:
=======
During dropping of fts index, InnoDB waits for fts_optimize_remove_table()
and it holds dict_sys->mutex and dict_operaiton_lock even though the
table id is not present in the queue. But fts_optimize_thread does wait
for dict_sys->mutex to process the unrelated table id from the slot.
Solution:
========
Whenever table is added to fts_optimize_wq, update the fts_status
of in-memory fts subsystem to TABLE_IN_QUEUE. Whenever drop index
wants to remove table from the queue, it can check the fts_status
to decide whether it should send the MSG_DELETE_TABLE to the queue.
Removed the following functions because these are all deadcode.
dict_table_wait_for_bg_threads_to_exit(),
fts_wait_for_background_thread_to_start(),fts_start_shutdown(), fts_shudown().