Commit graph

971 commits

Author SHA1 Message Date
Vlad Lesin
96ae37abc5 MDEV-30658 lock_row_lock_current_waits counter in information_schema.innodb_metrics may become negative
MONITOR_OVLD_ROW_LOCK_CURRENT_WAIT monitor should has
MONITOR_DISPLAY_CURRENT flag set in its definition, as it shows the
current state and does not accumulate anything.

Reviewed by: Marko Mäkelä
2023-10-05 18:27:54 +03:00
Marko Mäkelä
02878f128e MDEV-31813 SET GLOBAL innodb_max_purge_lag_wait hangs if innodb_read_only
innodb_max_purge_lag_wait_update(): Return immediately if we are
in high_level_read_only mode.

srv_wake_purge_thread_if_not_active(): Relax a debug assertion.
If srv_read_only_mode holds, purge_sys.enabled() will not hold
and this function will do nothing.

trx_t::commit_in_memory(): Remove a redundant condition before
invoking srv_wake_purge_thread_if_not_active().
2023-08-24 10:08:51 +03:00
Vlad Lesin
4c226c1850 MDEV-29050 mariabackup issues error messages during InnoDB tablespaces export on partial backup preparing
The solution is to suppress error messages for missing tablespaces if
mariabackup is launched with "--prepare --export" options.

"mariabackup --prepare --export" invokes itself with --mysqld parameter.
If the parameter is set, then it starts server to feed "FLUSH TABLES ...
FOR EXPORT;" queries for exported tablespaces. This is "normal" server
start, that's why new srv_operation value is introduced.

Reviewed by Marko Makela.
2023-03-27 20:15:10 +03:00
Marko Mäkelä
0c0a569028 Merge 10.3 into 10.4 2022-09-20 12:38:25 +03:00
Marko Mäkelä
c22dff21a5 InnoDB cleanup: Replace UNIV_LINUX, UNIV_SOLARIS, UNIV_AIX
Let us use the normal platform-specific preprocessor symbols
__linux__, __sun__, _AIX instead of some homebrew ones.

The preprocessor symbol UNIV_HPUX must have lost its meaning
by f6deb00a56 (note: the symbol
UNIV_HPUX10 is being checked for, but only UNIV_HPUX is defined).
2022-09-19 12:20:53 +03:00
Marko Mäkelä
e5c4f4e590 Merge 10.3 into 10.4 2022-07-27 14:25:36 +03:00
Marko Mäkelä
0ee1082bd2 MDEV-28495 InnoDB corruption due to lack of file locking
Starting with commit da094188f6 (MDEV-24393),
MariaDB will no longer acquire advisory file locks on InnoDB data
files by default, because it would create a large number of
entries in Linux /proc/locks.

The motivation for acquiring the file locks is to prevent accidental
concurrent startup of multiple server processes on the same data files.
Such mistake still turns out to be relatively common, based on
corruption bug reports from the community.

To prevent corruption due to concurrent startup attempts, the
Aria storage engine would unconditionally acquire an advisory lock
on one of its log files.

Solution: InnoDB will always lock its system tablespace files.
(Ever since commit 685d958e38
the InnoDB log file will not necessarily be open while the
server is running, because it can be accessed via memory-mapped I/O.)

If more protection is desired, then the option --external-locking
can be used.

The mandatory advisory lock also fixes intermittent failures of
some crash recovery tests. It turns out that when the mtr test harness
kills and restarts the server, it will not actually ensure that the
old process has terminated before starting the new one.
2022-07-27 14:15:14 +03:00
Marko Mäkelä
99c8aed00d MDEV-28601 InnoDB history list length was reverted to 32 bits
srv_do_purge(): In commit edde1f6e0d
when the de-facto 32-bit trx_sys_t::history_size() was replaced with
32-bit trx_sys.rseg_history_len, some more variables were changed
from ulint (size_t) to uint32_t.

The history list length is the number of committed transactions whose
undo logs are waiting to be purged. Each TRX_RSEG_HISTORY list is
storing the number of entries in a 32-bit field and each transaction
will occupy at least one undo log page. It is thinkable that the
length of each TRX_RSEG_HISTORY list may approach the maximum
representable number. The number cannot be exceeded, because the
rollback segment header is allocated from the same tablespace as
the undo log header pages it is pointing to, and because the page
numbers of a tablespace are stored in 32 bits. In any case, it is
possible that the total number of unpurged committed transactions
cannot be represented in 32 but 39 bits (corresponding to
128 rollback segments and undo tablespaces).
2022-05-25 14:06:04 +03:00
Sergei Golubchik
23ddc3518f Merge branch '10.3' into 10.4 2022-05-18 01:25:30 +02:00
Marko Mäkelä
3e564d468d MDEV-28541 Unused counter Innodb_encryption_key_rotation_list_length
The counter srv_stats.key_rotation_list_length is never updated, and
therefore Innodb_encryption_key_rotation_list_length will always be 0.

The view INFORMATION_SCHEMA.INNODB_TABLESPACES_ENCRYPTION comes close
to reporting this information.
2022-05-16 13:45:17 +03:00
Marko Mäkelä
4e1bf2bb23 MDEV-28537 Unused or useless InnoDB counters num_index_pages_written, num_non_index_pages_written
The counters were added in commit 5e55d1ced5
and any code to update them was
inadvertently removed in commit 2e814d4702
when applying InnoDB changes from MySQL 5.7.

Let us remove these counters that never reported anything useful. If such
statistics are really needed in a special case, they can be obtained by
instrumenting the code by some means, such as eBPF or a source code patch.
2022-05-16 13:41:53 +03:00
Marko Mäkelä
d172df9913 MDEV-25975: Merge 10.3 into 10.4 2022-04-06 09:18:38 +03:00
Marko Mäkelä
e9735a8185 MDEV-25975 innodb_disallow_writes causes shutdown to hang
We will remove the parameter innodb_disallow_writes because it is badly
designed and implemented. The parameter was never allowed at startup.
It was only internally used by Galera snapshot transfer.
If a user executed
SET GLOBAL innodb_disallow_writes=ON;
the server could hang even on subsequent read operations.

During Galera snapshot transfer, we will block writes
to implement an rsync friendly snapshot, as follows:

sst_flush_tables() will acquire a global lock by executing
FLUSH TABLES WITH READ LOCK, which will block any writes
at the high level.

sst_disable_innodb_writes(), invoked via ha_disable_internal_writes(true),
will suspend or disable InnoDB background tasks or threads that could
initiate writes. As part of this, log_make_checkpoint() will be invoked
to ensure that anything in the InnoDB buf_pool.flush_list will be written
to the data files. This has the nice side effect that the Galera joiner
will avoid crash recovery.

The changes to sql/wsrep.cc and to the tests are based on a prototype
that was developed by Jan Lindström.

Reviewed by: Jan Lindström
2022-04-06 08:06:49 +03:00
Oleksandr Byelkin
a576a1cea5 Merge branch '10.3' into 10.4 2022-01-30 09:46:52 +01:00
Oleksandr Byelkin
41a163ac5c Merge branch '10.2' into 10.3 2022-01-29 15:41:05 +01:00
Daniel Black
410c4edef3 MDEV-27467: innodb to enforce the minimum innodb_buffer_pool_size in SET GLOBAL
.. to be the same as startup.

In resolving MDEV-27461, BUF_LRU_MIN_LEN (256) is the minimum number of
pages for the innodb buffer pool size. Obviously we need more than just
flushing pages. Taking the 16k page size and its default minimum, an
extra 25% is needed on top of the flushing pages to make a workable buffer
pool.

The minimum innodb_buffer_pool_chunk_size (1M) restricts the minimum
otherwise we'd have a pool made up of different chunk sizes.

The resulting minimum innodb buffer pool sizes are:

Page Size, Previously minimum (startup), with change.
        4k                            5M           2M
        8k                            5M           3M
       16k                            5M           5M
       32k                           24M          10M
       64k                           24M          20M

With this patch, SET GLOBAL innodb_buffer_pool_size minimums are
enforced.

The evident minimum system variable size for innodb_buffer_pool_size
is 2M, however this is only setable if using 4k page size. As
the order of the page_size and buffer_pool_size aren't fixed, we can't
hide this change.

Subsequent changes:
* innodb_buffer_pool_resize_with_chunks.test - raised of pool resize due to new
  minimums. Chunk size also needed increase as the test was for
  pool_size < chunk_size to generate a warning.
* Removed srv_buf_pool_min_size and replaced use with MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* Removed srv_buf_pool_def_size and replaced constant defination in
  MYSQL_SYSVAR_LONGLONG(buffer_pool_size)
* Reordered ha_innodb to allow for direct use of MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* Moved buf_pool_size_align into ha_innodb to access to MYSQL_SYSVAR_NAME(buffer_pool_size).min_val
* loose-innodb_disable_resize_buffer_pool_debug is needed in the
  innodb.restart.opt test so that under debug mode, resizing of the
  innodb buffer pool can occur.
2022-01-19 11:10:45 +11:00
Julius Goryavsky
681b7784b6 Merge branch 10.3 into 10.4 2021-12-25 12:13:03 +01:00
Julius Goryavsky
3376668ca8 Merge branch 10.2 into 10.3 2021-12-23 14:14:04 +01:00
Marko Mäkelä
ef9517eb81 MDEV-27268 Failed InnoDB initialization leaves garbage files behind
create_log_files(): Check log_set_capacity() before modifying
or creating any log files.

innobase_start_or_create_for_mysql(): If create_log_files()
fails and we were initializing a new database, delete the
system tablespace files before exiting.
2021-12-15 14:17:55 +02:00
sjaakola
5c230b21bf MDEV-23328 Server hang due to Galera lock conflict resolution
Mutex order violation when wsrep bf thread kills a conflicting trx,
the stack is

          wsrep_thd_LOCK()
          wsrep_kill_victim()
          lock_rec_other_has_conflicting()
          lock_clust_rec_read_check_and_lock()
          row_search_mvcc()
          ha_innobase::index_read()
          ha_innobase::rnd_pos()
          handler::ha_rnd_pos()
          handler::rnd_pos_by_record()
          handler::ha_rnd_pos_by_record()
          Rows_log_event::find_row()
          Update_rows_log_event::do_exec_row()
          Rows_log_event::do_apply_event()
          Log_event::apply_event()
          wsrep_apply_events()

and mutexes are taken in the order

          lock_sys->mutex -> victim_trx->mutex -> victim_thread->LOCK_thd_data

When a normal KILL statement is executed, the stack is

          innobase_kill_query()
          kill_handlerton()
          plugin_foreach_with_mask()
          ha_kill_query()
          THD::awake()
          kill_one_thread()

        and mutexes are

          victim_thread->LOCK_thd_data -> lock_sys->mutex -> victim_trx->mutex

This patch is the plan D variant for fixing potetial mutex locking
order exercised by BF aborting and KILL command execution.

In this approach, KILL command is replicated as TOI operation.
This guarantees total isolation for the KILL command execution
in the first node: there is no concurrent replication applying
and no concurrent DDL executing. Therefore there is no risk of
BF aborting to happen in parallel with KILL command execution
either. Potential mutex deadlocks between the different mutex
access paths with KILL command execution and BF aborting cannot
therefore happen.

TOI replication is used, in this approach,  purely as means
to provide isolated KILL command execution in the first node.
KILL command should not (and must not) be applied in secondary
nodes. In this patch, we make this sure by skipping KILL
execution in secondary nodes, in applying phase, where we
bail out if applier thread is trying to execute KILL command.
This is effective, but skipping the applying of KILL command
could happen much earlier as well.

This also fixed unprotected calls to wsrep_thd_abort
that will use wsrep_abort_transaction. This is fixed
by holding THD::LOCK_thd_data while we abort transaction.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2021-10-29 09:52:52 +03:00
mkaruza
2f5ae0da71 MDEV-25883 Galera Cluster hangs while "DELETE FROM mysql.wsrep_cluster"
Using `innodb_thread_concurrency` will call `wsrep_thd_is_aborting` to
check WSREP thread state. This call should be protected by taking
`LOCK_thd_data` before entering function.
Applier and TOI threads should no be affected with usage of
`innodb_thread_concurrency` variable so returning before any checks.

Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
2021-09-30 12:25:26 +03:00
Marko Mäkelä
d5bd704f4b Merge 10.3 into 10.4 2021-09-24 12:11:52 +03:00
Marko Mäkelä
4bfdba2e89 MDEV-26672 innodb_undo_log_truncate may reset transaction ID sequence
trx_rseg_header_create(): Add a parameter for the value that is
to be written to TRX_RSEG_MAX_TRX_ID. If we omit this write, then
the updated test innodb.undo_truncate will fail for the 4k, 8k, 16k
page sizes. This was broken ever since
commit 947efe17ed (MDEV-15158)
removed the writes of transaction identifiers to the TRX_SYS page.

srv_do_purge(): Truncate undo tablespaces also during slow shutdown
(innodb_fast_shutdown=0).

Thanks to Krunal Bauskar for noticing this problem.
2021-09-24 11:23:37 +03:00
Marko Mäkelä
baf0ef9a18 After-merge fix: Remove duplicated code
In the merge commit d3e4fae797
a message about innodb_force_recovery was accidentally duplicated.
2021-06-21 14:05:43 +03:00
Marko Mäkelä
d3e4fae797 Merge 10.3 into 10.4 2021-06-21 12:38:25 +03:00
Marko Mäkelä
e46f76c974 MDEV-15912: Remove traces of insert_undo
Let us simply refuse an upgrade from earlier versions if the
upgrade procedure was not followed. This simplifies the purge,
commit, and rollback of transactions.

Before upgrading to MariaDB 10.3 or later, a clean shutdown
of the server (with innodb_fast_shutdown=1 or 0) is necessary,
to ensure that any incomplete transactions are rolled back.
The undo log format was changed in MDEV-12288. There is only
one persistent undo log for each transaction.
2021-06-21 12:34:07 +03:00
Nikita Malyavin
509e4990af Merge branch bb-10.3-release into bb-10.4-release 2021-05-05 23:03:01 +03:00
Nikita Malyavin
a8a925dd22 Merge branch bb-10.2-release into bb-10.3-release 2021-05-04 14:49:31 +03:00
Nikita Malyavin
300253acf1 revive innodb_debug_sync
innodb_debug_sync was introduced in commit
b393e2cb0c and reverted in
commit fc58c17216 due to memory leak reported
by valgrind, see MDEV-21336.

The leak is now fixed by adding `rw_lock_free(&slot->debug_sync_lock)`
after background thread working loop is finished, and the patch is
reapplied, with respect to c++98 fixes by Marko.

The missing DEBUG_SYNC for MDEV-18546 in row0vers.cc is also reapplied.
2021-04-27 11:51:17 +03:00
Marko Mäkelä
5008171b05 Merge 10.3 into 10.4 2021-04-14 10:33:59 +03:00
Marko Mäkelä
450c017c2d Merge 10.2 into 10.3 2021-04-09 14:32:06 +03:00
Srinidhi Kaushik
5bc5ecce08 MDEV-24197: Add "innodb_force_recovery" for "mariabackup --prepare"
During the prepare phase of restoring backups, "mariabackup" does
not seem to allow (or recognize) the option "innodb_force_recovery"
for the embedded InnoDB server instance that it starts.

If page corruption observed during page recovery, the prepare step
fails. While this is indeed the correct behavior ideally, allowing
this option to be set in case of emergencies might be useful when
the current backup is the only copy available. Some error messages
during "--prepare" suggest to set "innodb_force_recovery" to 1:

  [ERROR] InnoDB: Set innodb_force_recovery=1 to ignore corruption.

For backwards compatibility, "mariabackup --innobackupex --apply-log"
should also have this option.

Signed-off-by: Srinidhi Kaushik <shrinidhi.kaushik@gmail.com>
2021-04-01 13:34:40 +03:00
Marko Mäkelä
7ae37ff74f Merge 10.3 into 10.4 2021-03-27 17:12:28 +02:00
Marko Mäkelä
3157fa182a Merge 10.2 into 10.3 2021-03-27 16:11:26 +02:00
Marko Mäkelä
0f8caadc96 MDEV-22653: Remove the useless parameter innodb_simulate_comp_failures
The debug parameter innodb_simulate_comp_failures injected compression
failures for ROW_FORMAT=COMPRESSED tables, breaking the pre-existing
logic that I had implemented in the InnoDB Plugin for MySQL 5.1 to prevent
compressed page overflows. A much better check is already achieved by
defining UNIV_ZIP_COPY at the compilation time.
(Only UNIV_ZIP_DEBUG is part of cmake -DWITH_INNODB_EXTRA_DEBUG=ON.)
2021-03-22 18:12:44 +02:00
Sergei Golubchik
00a313ecf3 Merge branch 'bb-10.3-release' into bb-10.4-release
Note, the fix for "MDEV-23328 Server hang due to Galera lock conflict resolution"
was null-merged. 10.4 version of the fix is coming up separately
2021-02-12 17:44:22 +01:00
Sergei Golubchik
60ea09eae6 Merge branch '10.2' into 10.3 2021-02-01 13:49:33 +01:00
Marko Mäkelä
59eda73eff MDEV-24751: member call on fil_system.temp_space in innodb_shutdown()
innodb_shutdown(): Check that fil_system.temp_space is not null
before invoking a member function.

This regression was caused by the merge
commit fa1aef39eb
of MDEV-24340 (commit 1eb59c307d).
2021-02-01 13:17:17 +02:00
Marko Mäkelä
ea9cd97f85 MDEV-24536 innodb_idle_flush_pct has no effect
The parameter innodb_idle_flush_pct that was introduced in
MariaDB Server 10.1.2 by MDEV-6932 has no effect ever since
the InnoDB changes from MySQL 5.7.9 were applied in
commit 2e814d4702.

Let us declare the parameter as deprecated and having no effect.
2021-01-13 18:55:56 +02:00
Marko Mäkelä
0aa02567dd Merge 10.3 into 10.4 2020-12-23 14:52:59 +02:00
Marko Mäkelä
fa1aef39eb Merge 10.2 into 10.3 2020-12-23 14:25:45 +02:00
Marko Mäkelä
1eb59c307d MDEV-24340 Unique final message of InnoDB during shutdown
innobase_space_shutdown(): Remove. We want this step to be executed
before the message "InnoDB: Shutdown completed; log sequence number "
is output by innodb_shutdown(). It used to be executed after that step.

innodb_shutdown(): Duplicate the code that used to live in
innobase_space_shutdown().

innobase_init_abort(): Merge with innobase_space_shutdown().
2020-12-04 11:46:47 +02:00
Eugene Kosov
a50cb4867a MDEV-24334 make monitor_set_tbl global variable thread-safe
Atomic_relaxed<T>: add fetch_or() and fetch_and()

innodb_init(): rely on a zero-initialization of a global variable

monitor_set_tbl: make Atomic_relaxed<ulint> array and use proper operations
for setting bit, unsetting bit and reading bit

Reviewed by: Marko Mäkelä
2020-12-03 11:55:36 +03:00
Marko Mäkelä
2fa9f8c53a Merge 10.3 into 10.4 2020-08-20 11:01:47 +03:00
Marko Mäkelä
de0e7cd72a Merge 10.2 into 10.3 2020-08-20 09:12:16 +03:00
Marko Mäkelä
309302a3da MDEV-23475 InnoDB performance regression for write-heavy workloads
In commit fe39d02f51 (MDEV-20638)
we removed some wake-up signaling of the master thread that should
have been there, to ensure a steady log checkpointing workload.

Common sense suggests that the commit omitted some necessary calls
to srv_inc_activity_count(). But, an attempt to add the call to
trx_flush_log_if_needed_low() as well as to reinstate the function
innobase_active_small() did not restore the performance for the
case where sync_binlog=1 is set.

Therefore, we will revert the entire commit in MariaDB Server 10.2.
In MariaDB Server 10.5, adding a srv_inc_activity_count() call to
trx_flush_log_if_needed_low() did restore the performance, so we
will not revert MDEV-20638 across all versions.
2020-08-19 11:18:56 +03:00
Marko Mäkelä
9216114ce7 Merge 10.3 into 10.4 2020-07-31 18:09:08 +03:00
Marko Mäkelä
66ec3a770f Merge 10.2 into 10.3 2020-07-31 13:51:28 +03:00
sjaakola
95132ade6d MDEV-20928 mtr test galera.galera_var_innodb_disallow_writes test failure
The sporadic test hangs happen because of mutex dealock between innodb
background threads and two test connection executions.
The test sets variable innodb_disallow_writes, which blocks all writes
to filesyste. The test logic is to execute an INSERT, which should hang
because of filesytstem writes are blocked, and through another session
verify by SELECT that this hanging happens. The SELECT session will then
release innodb_disallow_writes blocking.

However, filesystem write  blocking affects also innodb background threads
and they may hang while keeping some other resources locked.
As an example, in one test hang situation, buffer pool access was blocked.
And, if buffer pool is blocked, the test connections will be blocked as well,
and the SELECT session will not be able to continue to release the
innodb_disallow_writes.

The fix in this commit is refactoring of the test logic.
The test will now set first innodb_disallow_writes blocking, and then record
a hash of data directory's filesystem contents. This works as checksum of the
state of data on the datadirectory.

Then some SQL load is tried on both nodes, these sessions will be blocking
due to frozen file system state. The test will have a short sleep to allow
innodb background threads to loop and possibly encounter innodb_disallow_writes
blocking as well.

After the sleep, the test will record file system checksun for the second time,
and then release the innodb_disallow-writes blocking.

Finally, the two checksums are compared, they should be identical to verify that
nothing was written on datadirectory during the test execution.

The checksum is implemented by md5sum hash over all files found in datadirectory
by find command. all these file hashes are hashed together by one more md5sum.

The test therefore depends on md5sum and find. find may work differently with some
OS distributions, e.g. freebsd may be problematic.
2020-07-24 12:05:39 +03:00
Thirunarayanan Balathandayuthapani
fe39d02f51 MDEV-20638 Remove the deadcode from srv_master_thread() and srv_active_wake_master_thread_low()
- Due to commit fe95cb2e40 (MDEV-16125),
InnoDB master thread does not need to call srv_resume_thread()
and therefore there is no need to wake up the thread.
Due to the above patch, InnoDB should remove the following dead code.

srv_check_activity(): Makes the parameter as in,out and returns the
recent activity value

innobase_active_small(): Removed

srv_active_wake_master_thread(): Removed

srv_wake_master_thread(): Removed

srv_active_wake_master_thread_low(): Removed

Simplify srv_master_thread() and remove switch cases, added the assert.

Replace srv_wake_master_thread() with srv_inc_activity_count()

INNOBASE_WAKE_INTERVAL: Removed
2020-07-23 16:23:20 +05:30