Almost all threads have gone
- the "ticking" threads, that sleep a while then do some work)
(srv_monitor_thread, srv_error_monitor_thread, srv_master_thread)
were replaced with timers. Some timers are periodic,
e.g the "master" timer.
- The btr_defragment_thread is also replaced by a timer , which
reschedules it self when current defragment "item" needs throttling
- the buf_resize_thread and buf_dump_threads are substitutes with tasks
Ditto with page cleaner workers.
- purge workers threads are not tasks as well, and purge cleaner
coordinator is a combination of a task and timer.
- All AIO is outsourced to tpool, Innodb just calls thread_pool::submit_io()
and provides the callback.
- The srv_slot_t was removed, and innodb_debug_sync used in purge
is currently not working, and needs reimplementation.
The library is capable of
- asynchronous execution of tasks (and optionally waiting for them)
- asynchronous file IO
This is implemented using libaio on Linux and completion ports on
Windows. Elsewhere, async io is "simulated", which means worker threads
are performing synchronous IO.
- timers, scheduling work asynchronously in some point of the future.
Also periodic timers are implemented.
This is a prerequisite patch required to remove Innodb's
thd_destructor_proxy thread.
The patch implement pre-shutdown functionality for handlers.
A storage engine might need to perform some work after all user
connections are shut down, but before killing off the plugins.
The reason is that an SE could still be using some of the
server infrastructure. In case of Innodb this would be purge threads,
that call into the server to calculate results of virtual function,
acquire MDL locks on tables, or possibly also use the audit plugins.
Create background THDs that are not counted, and do not block
close_connections(), in other words they are just something that plugin
can use internally.
These THD will be used later by Innodb purge threads, and they need THD
to calculate virtual functions.
Threadpool will need a functionality for periodic thr_timer
(the threadpool maintainence task is a timer that runs periodically).
Also increase the stack size for the timer thread, 8k won't be enough.
- Added dynamic column width fro `user` and `db` columns
- Added option `%` to hide the progress
- Follow up changes from Ubuntu
- Take `INFORMATION_SCHEMA.PROCESSLIST` into account to determine
progress of the `actual stage`
- Indentation fixes
Closes#215
Apart from page latches (buf_block_t::lock), mini-transactions
are keeping track of at most one dict_index_t::lock and
fil_space_t::latch at a time, and in a rare case, purge_sys.latch.
Let us introduce interfaces for acquiring an index latch
or a tablespace latch.
In a later version, we may want to introduce mtr_t members
for holding a latched dict_index_t* and fil_space_t*,
and replace the remaining use of mtr_t::m_memo
with std::set<buf_block_t*> or with a map<buf_block_t*,byte*>
pointing to log records.
In the test innodb.instant_alter,4k we would be flagging an error
for too large row size. That error was previously only being reported
if the table was being rebuilt. Thus, this merge is fixing a small
omission in MDEV-11369 (instant ADD COLUMN).
revision-id: 673e253724979fd9fe43a4a22bd7e1b2c3a5269e
Author: Kristian Nielsen
Fix missing memory barrier in wait_for_commit.
The function wait_for_commit::wait_for_prior_commit() has a fast path where it
checks without locks if wakeup_subsequent_commits() has already been called.
This check was missing a memory barrier. The waitee thread does two writes to
variables `waitee' and `wakeup_error', and if the waiting thread sees the
first write it _must_ also see the second or incorrect behavior will occur.
This requires memory barriers between both the writes (release semantics) and
the reads (acquire semantics) of those two variables.
Other accesses to these variables are done under lock or where only one thread
will be accessing them, and can be done without barriers (relaxed semantics).
btr_create(), btr_root_raise_and_insert(): Write a MLOG_MEMSET record
to set FIL_PAGE_PREV,FIL_PAGE_NEXT to FIL_NULL, instead of writing
two MLOG_4BYTES records.
For ROW_FORMAT=COMPRESSED pages, we will not use MLOG_MEMSET
because we want the crash-downgrade to earlier 10.4 releases to succeed.
mlog_parse_nbytes(): Relax the too strict assertion. There is no problem
with MLOG_MEMSET records that affect the uncompressed header of
ROW_FORMAT=COMPRESSED index pages.
Fix incorrect change introduced in the fix for MDEV-20109.
The patch tried to compute a more precise estimate for the record_count
value in SJ-Materialization-Scan strategy (in
Sj_materialization_picker::check_qep). However the new formula is worse
as it produces extremely optimistic results in common cases where
SJ-Materialization-Scan should be used)
The old formula produces pessimistic results in cases when Sj-Materialization-
Scan is unlikely to be a good choice anyway. So, the old behavior is better.
Move row size check to early CREATE/ALTER TABLE phase. Stop checking
on table open.
dict_index_add_to_cache(): remove parameter 'strict', stop checking row size
dict_index_t::record_size_info_t: this is a result of row size check operation
create_table_info_t::row_size_is_acceptable(): performs row size check.
Issues error or warning. Writes first overflow field to InnoDB log.
create_table_info_t::create_table(): add row size check
dict_index_t::record_size_info(): this is a refactored version
of dict_index_t::rec_potentially_too_big(). New version doesn't change global
state of a program but return all interesting info. And it's callers who
decide how to handle row size overflow.
dict_index_t::rec_potentially_too_big(): removed
A search with PAGE_CUR_GE may land on the supremum record on
a leaf page that is not the rightmost leaf page.
This could occur when all keys on the current page are
smaller than the search key, and the smallest key on the
successor page is larger than the search key.
ibuf_delete_recs(): Correct the debug assertion accordingly.
mtr_t::Impl, mtr_t::Command: Merge to mtr_t.
MTR_MAGIC_N: Remove.
MTR_STATE_COMMITTING: Remove. This state was only being set
internally during mtr_t::commit().
mtr_t::Command::m_locks_released: Remove (set-and-never-read member).
mtr_t::Command::m_start_lsn: Replaced with the return value of
finish_write() and a parameter to release_blocks().
mtr_t::Command::m_end_lsn: Removed as a duplicate of mtr_t::m_commit_lsn.
mtr_t::Command::prepare_write(): Replace a switch () with a
comparison against 0. Only 2 m_log_mode are allowed.
Count the "gap" time between table accesses and display it as
r_other_time_ms in the "table" element.
* The advantage of this approach is that it doesn't add any new
my_timer_cycles() calls.
* The disadvantage is that the definition of what is done during
"other time" is not that clear: it includes checking the WHERE
(for this table), constructing index lookup tuple (for the next table)
writing to GROUP BY temporary table (as we dont account for that time
separately [yet], etc)
Problem:
========
CURRENT_TEST: binlog_encryption.rpl_corruption
mysqltest: In included file "./include/wait_for_slave_io_error.inc":
...
At line 72: Slave stopped with wrong error code
**** Slave stopped with wrong error code: 1743 (expected 1595,1913) ****
Analysis:
========
The test emulates the corruption at the various stages of replication for
example in binlog file, in network and in relay log etc. It verifies that all
corruption cases are handled through appropriate error messages.
The test cases which emulate network failure expect following errors.
--ER_SLAVE_RELAY_LOG_WRITE_FAILURE (1595)
--ER_NETWORK_READ_EVENT_CHECKSUM_FAILURE (1743)
Ideally test should expect error codes as 1595 and 1743.
But the test actually waits on incorrect error code 1595,1913
Fix:
===
Added appropriate error code for 'ER_NETWORK_READ_EVENT_CHECKSUM_FAILURE'.
Replaced 1913 with 1743.
New iterator has the fastest possible implementation: just moves one pointer.
It's faster that List_iterator and List_iterator_fast: both do more on increment.
Overall patch brings:
1) work compile times
2) possibly(!) worse debug build performance
3) definitely better optimized build performance
4) ability to write less code
5) ability to write less bug-prone code
fseg_create(): Initialize FSEG_FRAG_ARRY by a single MLOG_MEMSET record.
flst_zero_addr(), flst_init(): Optimize away redundant writes.
fseg_free_page_low(): Write FIL_NULL by MLOG_MEMSET.
The XDES_CLEAN_BIT is always set for every element of
the page allocation bitmap in the extent descriptor pages.
Do not bother touching it, to avoid redundant writes.
page_rec_write_field(): Remove.
dict_create_index_tree_step(): If the SYS_INDEXES.PAGE does not change,
do not update it in the data dictionary. Typically, all index page numbers
would be unchanged before and after IMPORT TABLESPACE, except if some
secondary indexes were created after loading some data.
btr_root_fseg_adjust_on_import(): Remove the redundant mtr_t* parameter.
Redo logging is disabled during the page adjustments that IMPORT TABLESPACE
is performing.