Commit graph

130 commits

Author SHA1 Message Date
Marko Mäkelä
7b51d11cca MDEV-25594: Improve debug checks
trx_t::will_lock: Changed the type to bool.

trx_t::is_autocommit_non_locking(): Replaces
trx_is_autocommit_non_locking().

trx_is_ac_nl_ro(): Remove (replaced with equivalent assertion expressions).

assert_trx_nonlocking_or_in_list(): Remove.
Replaced with at least as strict checks in each place.

check_trx_state(): Moved to a static function; partially replaced with
individual debug assertions implementing equivalent or stricter checks.
2021-05-18 09:27:59 +03:00
Marko Mäkelä
80ed136e6d Merge 10.4 into 10.5 2021-04-21 09:01:01 +03:00
Marko Mäkelä
a0588d54a2 Merge 10.3 into 10.4 2021-04-21 07:58:42 +03:00
Marko Mäkelä
75c01f39b1 Merge 10.2 into 10.3 2021-04-21 07:25:48 +03:00
Marko Mäkelä
be881ec457 Merge 10.4 into 10.5 2021-03-19 13:09:21 +02:00
Marko Mäkelä
961c7938bb Merge 10.4 into 10.5 2021-01-25 12:44:24 +02:00
Marko Mäkelä
3467f63764 Merge 10.3 into 10.4 2021-01-25 11:02:07 +02:00
Marko Mäkelä
0e10d7ea14 MDEV-22351 InnoDB may recover wrong information after RESET MASTER
Ever since commit 947efe17ed
InnoDB no longer writes binlog position in one place.
It will not at all be written to the TRX_SYS page, and
instead it will be written to the undo log header page that
changes the transaction state.

trx_rseg_mem_restore(): Recover the information from the latest
written page.
2021-01-22 16:44:17 +02:00
Marko Mäkelä
7b2bb67113 Merge 10.3 into 10.4 2020-10-29 13:38:38 +02:00
Marko Mäkelä
a8de8f261d Merge 10.2 into 10.3 2020-10-28 10:01:50 +02:00
Marko Mäkelä
45ed9dd957 MDEV-23855: Remove fil_system.LRU and reduce fil_system.mutex contention
Also fixes MDEV-23929: innodb_flush_neighbors is not being ignored
for system tablespace on SSD

When the maximum configured number of file is exceeded, InnoDB will
close data files. We used to maintain a fil_system.LRU list and
a counter fil_node_t::n_pending to achieve this, at the huge cost
of multiple fil_system.mutex operations per I/O operation.

fil_node_open_file_low(): Implement a FIFO replacement policy:
The last opened file will be moved to the end of fil_system.space_list,
and files will be closed from the start of the list. However, we will
not move tablespaces in fil_system.space_list while
i_s_tablespaces_encryption_fill_table() is executing
(producing output for INFORMATION_SCHEMA.INNODB_TABLESPACES_ENCRYPTION)
because it may cause information of some tablespaces to go missing.
We also avoid this in mariabackup --backup because datafiles_iter_next()
assumes that the ordering is not changed.

IORequest: Fold more parameters to IORequest::type.

fil_space_t::io(): Replaces fil_io().

fil_space_t::flush(): Replaces fil_flush().

OS_AIO_IBUF: Remove. We will always issue synchronous reads of the
change buffer pages in buf_read_page_low().

We will always ignore some errors for background reads.

This should reduce fil_system.mutex contention a little.

fil_node_t::complete_write(): Replaces fil_node_t::complete_io().
On both read and write completion, fil_space_t::release_for_io()
will have to be called.

fil_space_t::io(): Do not acquire fil_system.mutex in the normal
code path.

xb_delta_open_matching_space(): Do not try to open the system tablespace
which was already opened. This fixes a file sharing violation in
mariabackup --prepare --incremental.

Reviewed by: Vladislav Vaintroub
2020-10-26 17:09:01 +02:00
Marko Mäkelä
7cffb5f6e8 MDEV-23399: Performance regression with write workloads
The buffer pool refactoring in MDEV-15053 and MDEV-22871 shifted
the performance bottleneck to the page flushing.

The configuration parameters will be changed as follows:

innodb_lru_flush_size=32 (new: how many pages to flush on LRU eviction)
innodb_lru_scan_depth=1536 (old: 1024)
innodb_max_dirty_pages_pct=90 (old: 75)
innodb_max_dirty_pages_pct_lwm=75 (old: 0)

Note: The parameter innodb_lru_scan_depth will only affect LRU
eviction of buffer pool pages when a new page is being allocated. The
page cleaner thread will no longer evict any pages. It used to
guarantee that some pages will remain free in the buffer pool. Now, we
perform that eviction 'on demand' in buf_LRU_get_free_block().
The parameter innodb_lru_scan_depth(srv_LRU_scan_depth) is used as follows:
 * When the buffer pool is being shrunk in buf_pool_t::withdraw_blocks()
 * As a buf_pool.free limit in buf_LRU_list_batch() for terminating
   the flushing that is initiated e.g., by buf_LRU_get_free_block()
The parameter also used to serve as an initial limit for unzip_LRU
eviction (evicting uncompressed page frames while retaining
ROW_FORMAT=COMPRESSED pages), but now we will use a hard-coded limit
of 100 or unlimited for invoking buf_LRU_scan_and_free_block().

The status variables will be changed as follows:

innodb_buffer_pool_pages_flushed: This includes also the count of
innodb_buffer_pool_pages_LRU_flushed and should work reliably,
updated one by one in buf_flush_page() to give more real-time
statistics. The function buf_flush_stats(), which we are removing,
was not called in every code path. For both counters, we will use
regular variables that are incremented in a critical section of
buf_pool.mutex. Note that show_innodb_vars() directly links to the
variables, and reads of the counters will *not* be protected by
buf_pool.mutex, so you cannot get a consistent snapshot of both variables.

The following INFORMATION_SCHEMA.INNODB_METRICS counters will be
removed, because the page cleaner no longer deals with writing or
evicting least recently used pages, and because the single-page writes
have been removed:
* buffer_LRU_batch_flush_avg_time_slot
* buffer_LRU_batch_flush_avg_time_thread
* buffer_LRU_batch_flush_avg_time_est
* buffer_LRU_batch_flush_avg_pass
* buffer_LRU_single_flush_scanned
* buffer_LRU_single_flush_num_scan
* buffer_LRU_single_flush_scanned_per_call

When moving to a single buffer pool instance in MDEV-15058, we missed
some opportunity to simplify the buf_flush_page_cleaner thread. It was
unnecessarily using a mutex and some complex data structures, even
though we always have a single page cleaner thread.

Furthermore, the buf_flush_page_cleaner thread had separate 'recovery'
and 'shutdown' modes where it was waiting to be triggered by some
other thread, adding unnecessary latency and potential for hangs in
relatively rarely executed startup or shutdown code.

The page cleaner was also running two kinds of batches in an
interleaved fashion: "LRU flush" (writing out some least recently used
pages and evicting them on write completion) and the normal batches
that aim to increase the MIN(oldest_modification) in the buffer pool,
to help the log checkpoint advance.

The buf_pool.flush_list flushing was being blocked by
buf_block_t::lock for no good reason. Furthermore, if the FIL_PAGE_LSN
of a page is ahead of log_sys.get_flushed_lsn(), that is, what has
been persistently written to the redo log, we would trigger a log
flush and then resume the page flushing. This would unnecessarily
limit the performance of the page cleaner thread and trigger the
infamous messages "InnoDB: page_cleaner: 1000ms intended loop took 4450ms.
The settings might not be optimal" that were suppressed in
commit d1ab89037a unless log_warnings>2.

Our revised algorithm will make log_sys.get_flushed_lsn() advance at
the start of buf_flush_lists(), and then execute a 'best effort' to
write out all pages. The flush batches will skip pages that were modified
since the log was written, or are are currently exclusively locked.
The MDEV-13670 message "page_cleaner: 1000ms intended loop took" message
will be removed, because by design, the buf_flush_page_cleaner() should
not be blocked during a batch for extended periods of time.

We will remove the single-page flushing altogether. Related to this,
the debug parameter innodb_doublewrite_batch_size will be removed,
because all of the doublewrite buffer will be used for flushing
batches. If a page needs to be evicted from the buffer pool and all
100 least recently used pages in the buffer pool have unflushed
changes, buf_LRU_get_free_block() will execute buf_flush_lists() to
write out and evict innodb_lru_flush_size pages. At most one thread
will execute buf_flush_lists() in buf_LRU_get_free_block(); other
threads will wait for that LRU flushing batch to finish.

To improve concurrency, we will replace the InnoDB ib_mutex_t and
os_event_t native mutexes and condition variables in this area of code.
Most notably, this means that the buffer pool mutex (buf_pool.mutex)
is no longer instrumented via any InnoDB interfaces. It will continue
to be instrumented via PERFORMANCE_SCHEMA.

For now, both buf_pool.flush_list_mutex and buf_pool.mutex will be
declared with MY_MUTEX_INIT_FAST (PTHREAD_MUTEX_ADAPTIVE_NP). The critical
sections of buf_pool.flush_list_mutex should be shorter than those for
buf_pool.mutex, because in the worst case, they cover a linear scan of
buf_pool.flush_list, while the worst case of a critical section of
buf_pool.mutex covers a linear scan of the potentially much longer
buf_pool.LRU list.

mysql_mutex_is_owner(), safe_mutex_is_owner(): New predicate, usable
with SAFE_MUTEX. Some InnoDB debug assertions need this predicate
instead of mysql_mutex_assert_owner() or mysql_mutex_assert_not_owner().

buf_pool_t::n_flush_LRU, buf_pool_t::n_flush_list:
Replaces buf_pool_t::init_flush[] and buf_pool_t::n_flush[].
The number of active flush operations.

buf_pool_t::mutex, buf_pool_t::flush_list_mutex: Use mysql_mutex_t
instead of ib_mutex_t, to have native mutexes with PERFORMANCE_SCHEMA
and SAFE_MUTEX instrumentation.

buf_pool_t::done_flush_LRU: Condition variable for !n_flush_LRU.

buf_pool_t::done_flush_list: Condition variable for !n_flush_list.

buf_pool_t::do_flush_list: Condition variable to wake up the
buf_flush_page_cleaner when a log checkpoint needs to be written
or the server is being shut down. Replaces buf_flush_event.
We will keep using timed waits (the page cleaner thread will wake
_at least_ once per second), because the calculations for
innodb_adaptive_flushing depend on fixed time intervals.

buf_dblwr: Allocate statically, and move all code to member functions.
Use a native mutex and condition variable. Remove code to deal with
single-page flushing.

buf_dblwr_check_block(): Make the check debug-only. We were spending
a significant amount of execution time in page_simple_validate_new().

flush_counters_t::unzip_LRU_evicted: Remove.

IORequest: Make more members const. FIXME: m_fil_node should be removed.

buf_flush_sync_lsn: Protect by std::atomic, not page_cleaner.mutex
(which we are removing).

page_cleaner_slot_t, page_cleaner_t: Remove many redundant members.

pc_request_flush_slot(): Replaces pc_request() and pc_flush_slot().

recv_writer_thread: Remove. Recovery works just fine without it, if we
simply invoke buf_flush_sync() at the end of each batch in
recv_sys_t::apply().

recv_recovery_from_checkpoint_finish(): Remove. We can simply call
recv_sys.debug_free() directly.

srv_started_redo: Replaces srv_start_state.

SRV_SHUTDOWN_FLUSH_PHASE: Remove. logs_empty_and_mark_files_at_shutdown()
can communicate with the normal page cleaner loop via the new function
flush_buffer_pool().

buf_flush_remove(): Assert that the calling thread is holding
buf_pool.flush_list_mutex. This removes unnecessary mutex operations
from buf_flush_remove_pages() and buf_flush_dirty_pages(),
which replace buf_LRU_flush_or_remove_pages().

buf_flush_lists(): Renamed from buf_flush_batch(), with simplified
interface. Return the number of flushed pages. Clarified comments and
renamed min_n to max_n. Identify LRU batch by lsn=0. Merge all the functions
buf_flush_start(), buf_flush_batch(), buf_flush_end() directly to this
function, which was their only caller, and remove 2 unnecessary
buf_pool.mutex release/re-acquisition that we used to perform around
the buf_flush_batch() call. At the start, if not all log has been
durably written, wait for a background task to do it, or start a new
task to do it. This allows the log write to run concurrently with our
page flushing batch. Any pages that were skipped due to too recent
FIL_PAGE_LSN or due to them being latched by a writer should be flushed
during the next batch, unless there are further modifications to those
pages. It is possible that a page that we must flush due to small
oldest_modification also carries a recent FIL_PAGE_LSN or is being
constantly modified. In the worst case, all writers would then end up
waiting in log_free_check() to allow the flushing and the checkpoint
to complete.

buf_do_flush_list_batch(): Clarify comments, and rename min_n to max_n.
Cache the last looked up tablespace. If neighbor flushing is not applicable,
invoke buf_flush_page() directly, avoiding a page lookup in between.

buf_flush_space(): Auxiliary function to look up a tablespace for
page flushing.

buf_flush_page(): Defer the computation of space->full_crc32(). Never
call log_write_up_to(), but instead skip persistent pages whose latest
modification (FIL_PAGE_LSN) is newer than the redo log. Also skip
pages on which we cannot acquire a shared latch without waiting.

buf_flush_try_neighbors(): Do not bother checking buf_fix_count
because buf_flush_page() will no longer wait for the page latch.
Take the tablespace as a parameter, and only execute this function
when innodb_flush_neighbors>0. Avoid repeated calls of page_id_t::fold().

buf_flush_relocate_on_flush_list(): Declare as cold, and push down
a condition from the callers.

buf_flush_check_neighbor(): Take id.fold() as a parameter.

buf_flush_sync(): Ensure that the buf_pool.flush_list is empty,
because the flushing batch will skip pages whose modifications have
not yet been written to the log or were latched for modification.

buf_free_from_unzip_LRU_list_batch(): Remove redundant local variables.

buf_flush_LRU_list_batch(): Let the caller buf_do_LRU_batch() initialize
the counters, and report n->evicted.
Cache the last looked up tablespace. If neighbor flushing is not applicable,
invoke buf_flush_page() directly, avoiding a page lookup in between.

buf_do_LRU_batch(): Return the number of pages flushed.

buf_LRU_free_page(): Only release and re-acquire buf_pool.mutex if
adaptive hash index entries are pointing to the block.

buf_LRU_get_free_block(): Do not wake up the page cleaner, because it
will no longer perform any useful work for us, and we do not want it
to compete for I/O while buf_flush_lists(innodb_lru_flush_size, 0)
writes out and evicts at most innodb_lru_flush_size pages. (The
function buf_do_LRU_batch() may complete after writing fewer pages if
more than innodb_lru_scan_depth pages end up in buf_pool.free list.)
Eliminate some mutex release-acquire cycles, and wait for the LRU
flush batch to complete before rescanning.

buf_LRU_check_size_of_non_data_objects(): Simplify the code.

buf_page_write_complete(): Remove the parameter evict, and always
evict pages that were part of an LRU flush.

buf_page_create(): Take a pre-allocated page as a parameter.

buf_pool_t::free_block(): Free a pre-allocated block.

recv_sys_t::recover_low(), recv_sys_t::apply(): Preallocate the block
while not holding recv_sys.mutex. During page allocation, we may
initiate a page flush, which in turn may initiate a log flush, which
would require acquiring log_sys.mutex, which should always be acquired
before recv_sys.mutex in order to avoid deadlocks. Therefore, we must
not be holding recv_sys.mutex while allocating a buffer pool block.

BtrBulk::logFreeCheck(): Skip a redundant condition.

row_undo_step(): Do not invoke srv_inc_activity_count() for every row
that is being rolled back. It should suffice to invoke the function in
trx_flush_log_if_needed() during trx_t::commit_in_memory() when the
rollback completes.

sync_check_enable(): Remove. We will enable innodb_sync_debug from the
very beginning.

Reviewed by: Vladislav Vaintroub
2020-10-15 17:04:56 +03:00
Marko Mäkelä
0b4ed0b7fb Merge 10.4 into 10.5 2020-08-21 20:32:04 +03:00
Marko Mäkelä
aa6cb7ed03 Merge 10.3 into 10.4 2020-08-21 19:57:22 +03:00
Marko Mäkelä
c277bcd591 Merge 10.2 into 10.3 2020-08-21 19:18:34 +03:00
Eugene Kosov
e9c389c334 MDEV-22701 InnoDB: encapsulate trx_sys.mutex and trx_sys.trx_list into a separate class
thread_safe_trx_ilist_t: almost generic one

UT_LIST was replaced with ilist<t>

innobase_kill_query: wrong comment removed.
2020-06-23 19:11:57 +03:00
Sergey Vojtovich
ccdfcedf10 MDEV-22693 - InnoDB: get rid of casts for rw_trx_hash iterator
Less error prone, stricter type control.
2020-05-30 10:24:11 +04:00
Sergey Vojtovich
03dcdad251 MDEV-22697 - InnoDB: remove trx->no
trx->no is duplicated by trx->rw_trx_hash_element->no for no good reason.
The latter is preferred for performance reasons: allows to avoid taking
trx->rw_trx_hash_element->mutex when snapshotting.
2020-05-26 17:11:31 +04:00
Sergey Vojtovich
50b0ce44a2 MDEV-22593 - InnoDB: don't take trx_sys.mutex in ReadView::open()
This was the last abuse of trx_sys.mutex, which is now exclusively
protecting trx_sys.trx_list.

This global acquisition was also potential scalability bottleneck for
oltp_read_write benchmark. Although it didn't expose itself as such due
to larger scalability issues.

Replaced trx_sys.mutex based synchronisation between ReadView creator
thread and purge coordinator thread performing latest view clone with
ReadView::m_mutex.

It also allowed to simplify tri-state view m_state down to boolean
m_open flag.

For performance reasons trx->read_view.close() is left as atomic relaxed
store, so that we don't have to waste resources for otherwise meaningless
mutex acquisition.
2020-05-26 17:11:20 +04:00
Marko Mäkelä
3826178da8 Fix the Windows non-debug build
warning C4390:  ';': empty controlled statement found; is this the intent?
2019-11-28 19:53:00 +02:00
Marko Mäkelä
4beace3316 MDEV-21171 InnoDB is unnecessarily resetting FIL_PAGE_TYPE for full_crc32 files
Before commit c0f47a4a58 (MDEV-12026)
introduced the innodb_checksum_algorithm=full_crc32 format,
it was impossible to tell if InnoDB data files contained garbage in
the FIL_PAGE_TYPE header field (and possibly other fields).
This is because before commit 3926673ce7
in MySQL 5.1.48, InnoDB would write uninitialized data to some fields,
and because there was no way to tell with which InnoDB version a data
file was created.

If fil_space_t::full_crc32() holds, the data file cannot contain
uninitialized garbage or invalid FIL_PAGE_TYPE, and thus
fil_block_check_type() should not be invoked to correct anything.
2019-11-28 16:22:53 +02:00
Marko Mäkelä
2b7aa60b7e Use constexpr for constants on data pages 2019-11-13 14:34:52 +02:00
Sergei Golubchik
244f0e6dd8 Merge branch '10.3' into 10.4 2019-09-06 11:53:10 +02:00
Marko Mäkelä
2c9e75ccfe MDEV-15326 after-merge fixes
trx_t::is_recovered: Revert most of the changes that were made by the
merge of MDEV-15326 from 10.2. The trx_sys.rw_trx_hash and the recovery
of transactions at startup is quite different in 10.3.

trx_free_at_shutdown(): Avoid excessive mutex protection. Reading fields
that can only be modified by the current thread (owning the transaction)
can be done outside mutex.

trx_t::commit_state(): Restore a tighter assertion.

trx_rollback_recovered(): Clarify why there is no potential race condition
with other transactions.

lock_trx_release_locks(): Merge with trx_t::release_locks(),
and avoid holding lock_sys.mutex unnecessarily long.

rw_trx_hash_t::find(): Remove redundant code, and avoid starving the
committer by checking trx_t::state before trx_t::reference().
2019-09-05 15:58:31 +03:00
Marko Mäkelä
537f8594a6 Merge 10.2 into 10.3 2019-09-04 17:52:04 +03:00
Marko Mäkelä
b07beff894 MDEV-15326: InnoDB: Failing assertion: !other_lock
MySQL 5.7.9 (and MariaDB 10.2.2) introduced a race condition
between InnoDB transaction commit and the conversion of implicit
locks into explicit ones.

The assertion failure can be triggered with a test that runs
3 concurrent single-statement transactions in a loop on a simple
table:

CREATE TABLE t (a INT PRIMARY KEY) ENGINE=InnoDB;
thread1: INSERT INTO t SET a=1;
thread2: DELETE FROM t;
thread3: SELECT * FROM t FOR UPDATE; -- or DELETE FROM t;

The failure scenarios are like the following:
(1) The INSERT statement is being committed, waiting for lock_sys->mutex.
(2) At the time of the failure, both the DELETE and SELECT transactions
are active but have not logged any changes yet.
(3) The transaction where the !other_lock assertion fails started
lock_rec_convert_impl_to_expl().
(4) After this point, the commit of the INSERT removed the transaction from
trx_sys->rw_trx_set, in trx_erase_lists().
(5) The other transaction consulted trx_sys->rw_trx_set and determined
that there is no implicit lock. Hence, it grabbed the lock.
(6) The !other_lock assertion fails in lock_rec_add_to_queue()
for the lock_rec_convert_impl_to_expl(), because the lock was 'stolen'.
This assertion failure looks genuine, because the INSERT transaction
is still active (trx->state=TRX_STATE_ACTIVE).

The problematic step (4) was introduced in
mysql/mysql-server@e27e0e0bb7
which fixed something related to MVCC (covered by the test
innodb.innodb-read-view). Basically, it reintroduced an error
that had been mentioned in an earlier commit
mysql/mysql-server@a17be6963f:
"The active transaction was removed from trx_sys->rw_trx_set prematurely."

Our fix goes along the following lines:

(a) Implicit locks will released by assigning
trx->state=TRX_STATE_COMMITTED_IN_MEMORY as the first step.
This transition will no longer be protected by lock_sys_t::mutex,
only by trx->mutex. This idea is by Sergey Vojtovich.
(b) We detach the transaction from trx_sys before starting to release
explicit locks.
(c) All callers of trx_rw_is_active() and trx_rw_is_active_low() must
recheck trx->state after acquiring trx->mutex.
(d) Before releasing any explicit locks, we will ensure that any activity
by other threads to convert implicit locks into explicit will have ceased,
by checking !trx_is_referenced(trx). There was a glitch
in this check when it was part of lock_trx_release_locks(); at the end
we would release trx->mutex and acquire lock_sys->mutex and trx->mutex,
and fail to recheck (trx_is_referenced() is protected by trx_t::mutex).
(e) Explicit locks can be released in batches (LOCK_RELEASE_INTERVAL=1000)
just like we did before.

trx_t::state: Document that the transition to COMMITTED is only
protected by trx_t::mutex, no longer by lock_sys_t::mutex.

trx_rw_is_active_low(), trx_rw_is_active(): Document that the transaction
state should be rechecked after acquiring trx_t::mutex.

trx_t::commit_state(): New function to change a transaction to committed
state, to release implicit locks.

trx_t::release_locks(): New function to release the explicit locks
after commit_state().

lock_trx_release_locks(): Move much of the logic to the caller
(which must invoke trx_t::commit_state() and trx_t::release_locks()
as needed), and assert that the transaction will have locks.

trx_get_trx_by_xid(): Make the parameter a pointer to const.

lock_rec_other_trx_holds_expl(): Recheck trx->state after acquiring
trx->mutex, and avoid a redundant lookup of the transaction.

lock_rec_queue_validate(): Recheck impl_trx->state while holding
impl_trx->mutex.

row_vers_impl_x_locked(), row_vers_impl_x_locked_low():
Document that the transaction state must be rechecked after
trx_mutex_enter().

trx_free_prepared(): Adjust for the changes to lock_trx_release_locks().
2019-09-04 09:42:38 +03:00
Marko Mäkelä
7c79c12784 MDEV-15326 preparation: Remove trx_sys_t::n_prepared_trx
This is a backport of 900b07908b
from MariaDB Server 10.3.
2019-09-04 09:42:38 +03:00
Marko Mäkelä
db4a27ab73 Merge 10.3 into 10.4 2019-08-31 06:53:45 +03:00
Marko Mäkelä
d4246e25e5 After-merge fix
Merge a part of commit 25af2a183b.
This was forgotten in the
merge commit e41eb044f1.
2019-08-29 18:22:41 +03:00
Oleksandr Byelkin
c07325f932 Merge branch '10.3' into 10.4 2019-05-19 20:55:37 +02:00
Marko Mäkelä
be85d3e61b Merge 10.2 into 10.3 2019-05-14 17:18:46 +03:00
Marko Mäkelä
26a14ee130 Merge 10.1 into 10.2 2019-05-13 17:54:04 +03:00
Vicențiu Ciorbaru
c0ac0b8860 Update FSF address 2019-05-11 19:25:02 +03:00
Marko Mäkelä
e6bdf77e4b Merge 10.3 into 10.4
In is_eits_usable(), we disable an assertion that fails due to
MDEV-19334.
2019-04-25 16:05:20 +03:00
Marko Mäkelä
acf6f92aa9 Merge 10.2 into 10.3 2019-04-25 09:05:52 +03:00
Marko Mäkelä
0a1c3477bf MDEV-18493 Remove page_size_t
MySQL 5.7 introduced the class page_size_t and increased the size of
buffer pool page descriptors by introducing this object to them.

Maybe the intention of this exercise was to prepare for a future
where the buffer pool could accommodate multiple page sizes.
But that future never arrived, not even in MySQL 8.0. It is much
easier to manage a pool of a single page size, and typically all
storage devices of an InnoDB instance benefit from using the same
page size.

Let us remove page_size_t from MariaDB Server. This will make it
easier to remove support for ROW_FORMAT=COMPRESSED (or make it a
compile-time option) in the future, just by removing various
occurrences of zip_size.
2019-02-07 12:21:35 +02:00
Sergey Vojtovich
dc90234bda MDEV-17441 - InnoDB transition to C++11 atomics
Added lf_hash_size() macro, so that callers don't need to use atomic
operations.
2018-12-28 13:24:11 +04:00
Sergey Vojtovich
edde1f6e0d MDEV-17441 - InnoDB transition to C++11 atomics
trx_sys_t::rseg_history_len transition to Atomic_counter.
2018-12-27 22:46:38 +04:00
Sergey Vojtovich
4abdd798f7 MDEV-17441 - InnoDB transition to C++11 atomics
rw_trx_hash_element_t::no transition to Atomic_counter. Since C++11
doesn't seem to allow mixed (atomic and non-atomic) access to atomic
variables, we have to perform atomic initialisation.
2018-12-27 22:46:38 +04:00
Sergey Vojtovich
8023fc6d45 MDEV-17441 - InnoDB transition to C++11 atomics
Almost trivial trx_sys_t::m_rw_trx_hash_version transition. Since C++11
doesn't seem to allow mixed (atomic and non-atomic) access to atomic
variables, we have to perform atomic initialisation.
2018-12-27 22:46:38 +04:00
Sergey Vojtovich
90377b8028 MDEV-17441 - InnoDB transition to C++11 atomics
trx_sys_t::m_max_trx_id transition to Atomic_counter. Since C++11 doesn't
seem to allow mixed (atomic and non-atomic) access to atomic variables,
we have to perform atomic initialisation.
2018-12-27 22:46:38 +04:00
Marko Mäkelä
0abd2766b1 Merge 10.2 into 10.3
Also, related to MDEV-15522, MDEV-17304, MDEV-17835,
remove the Galera xtrabackup tests, because xtrabackup never worked
with MariaDB Server 10.3 due to InnoDB redo log format changes.
2018-11-30 09:38:56 +02:00
Marko Mäkelä
447e493179 Remove some unnecessary InnoDB #include 2018-11-29 12:53:44 +02:00
Marko Mäkelä
df563e0c03 Merge 10.2 into 10.3
main.derived_cond_pushdown: Move all 10.3 tests to the end,
trim trailing white space, and add an "End of 10.3 tests" marker.
Add --sorted_result to tests where the ordering is not deterministic.

main.win_percentile: Add --sorted_result to tests where the
ordering is no longer deterministic.
2018-11-06 09:40:39 +02:00
Eugene Kosov
14be814380 MDEV-17491 micro optimize page_id_t
page_id_t: remove m_fold member

various places: pass page_id_t by value instead of by reference
2018-10-25 18:46:27 +03:00
Marko Mäkelä
1eb2d8f6e8 Merge 10.2 into 10.3 2018-08-16 08:54:58 +03:00
Marko Mäkelä
1748a31ae8 MDEV-16675 Unnecessary explicit lock acquisition during UPDATE or DELETE
In InnoDB, an INSERT will not create an explicit lock object. Instead,
the inserted record is initially implicitly locked by the transaction
that wrote its trx_t::id to the hidden system column DB_TRX_ID.
(Other transactions would check if DB_TRX_ID is referring to a
transaction that has not been committed.)

If a record was inserted in the current transaction, it would be
implicitly locked by that transaction. Only if some other transaction
is requesting access to the record, the implicit lock should be
converted to an explicit one, so that the waits-for graph can be
constructed for detecting deadlocks and lock wait timeouts.

Before this fix, InnoDB would convert implicit locks to
explicit ones, even if no conflict exists.

lock_rec_convert_impl_to_expl(): Return whether caller_trx
already holds an explicit lock that covers the record.

row_vers_impl_x_locked_low(): Avoid a lookup if the record matches
caller_trx->id.

lock_trx_has_expl_x_lock(): Renamed from lock_trx_has_rec_x_lock().

row_upd_clust_step(): In a debug assertion, check for implicit lock
before invoking lock_trx_has_expl_x_lock().

rw_trx_hash_t::find(): Make do_ref_count a mandatory parameter.
Assert that trx_id is not 0 (the caller should check it).

trx_sys_t::is_registered(): Only invoke find() if id != 0.

trx_sys_t::find(): Add the optional parameter do_ref_count.

lock_rec_queue_validate(): Avoid lookup for trx_id == 0.
2018-07-03 15:10:06 +03:00
Marko Mäkelä
a90100d756 Replace univ_page_size and UNIV_PAGE_SIZE
Try to use one variable (srv_page_size) for innodb_page_size.

Also, replace UNIV_PAGE_SIZE_SHIFT with srv_page_size_shift.
2018-04-28 20:45:45 +03:00
Marko Mäkelä
ba19764209 Fix most -Wsign-conversion in InnoDB
Change innodb_buffer_pool_size, innodb_fill_factor to unsigned.
2018-04-28 20:45:45 +03:00
Marko Mäkelä
8861d442d6 Correct member function comments 2018-04-28 20:37:31 +03:00