dict_stats_shutdown() can hang, waiting for timer callback to finish.
This happens because locks the same mutex, which can also used inside
timer callback, within dict_stats_schedule() function.
Fix is to make dict_stats_schedule() use mutex.try_lock() instead of
mutex.lock().
In the unlikely case of simultaneous dict_stats_schedule() setting
different timer delays, now the first one would win, which is fine.
Important is that shutdown won't hang.
At each mini-transaction commit, the log sequence number of the
mini-transaction must be written to each modified page, so that
it will be available in the FIL_PAGE_LSN field when the page is
being read in crash recovery.
InnoDB was unnecessarily allocating redundant storage for the
field, in buf_page_t::newest_modification. Let us access
FIL_PAGE_LSN directly.
Furthermore, on ALTER TABLE...IMPORT TABLESPACE, let us write
0 to FIL_PAGE_LSN instead of using log_sys.lsn.
buf_flush_init_for_writing(), buf_flush_update_zip_checksum(),
fil_encrypt_buf_for_full_crc32(), fil_encrypt_buf(),
fil_space_encrypt(): Remove the parameter lsn.
buf_page_get_newest_modification(): Merge with the only caller.
buf_tmp_reserve_compression_buf(), buf_tmp_page_encrypt(),
buf_page_encrypt(): Define static in the same compilation unit
with the only caller.
PageConverter::m_current_lsn: Remove. Write 0 to FIL_PAGE_LSN
on ALTER TABLE...IMPORT TABLESPACE.
Implement syntax like:
create table t1 (x int) with system versioning partition by system_time;
which will create 1 history partition and 1 current partition.
Also it is possible to specify the number of history partitions:
create table t1 (x int) with system versioning partition by system_time partitions 5;
which will create 4 history partitions (and 1 current partition).
Tests:
partition.test cases are duplicated where it is appropriate for default partitions.
partition_rotation.test cases are replaced by default partitions where possible.
To reduce the difference between sql_yacc.yy and sql_yacc_ora.yy,
using yyerror() in both files, instead of MYSQLerror() and ORAerror().
The pre-processor replaces yyerror() to MYSQLerror() and ORAerror()
anyway.
Add support of referential constraints directly in column defininions:
create table t1 (id1 int primary key);
create table t2 (id2 int references t1(id1));
Referenced field name can be omitted if equal to foreign field name:
create table t1 (id int primary key);
create table t2 (id int references t1);
Until 10.5 this syntax was understood by the parser but was silently
ignored.
In case of generated columns this syntax is disabled at parser level
by ER_PARSE_ERROR. Note that separate FOREIGN KEY clause for generated
columns is disabled at storage engine level.
Currently InnoDB uses internal parser for adding foreign keys. Remove
internal parser and use data parsed by SQL parser (sql_yacc) for
adding foreign keys.
- create_table_info_t::create_foreign_keys() replacement for
dict_create_foreign_constraints_low();
- Pass constraint name via Foreign_key object.
Temporary until MDEV-20865:
- Pass alter_info as part of create_info.
btr_cur_instant_init_low(): Accurately parse the metadata record
header for ROW_FORMAT=DYNAMIC and ROW_FORMAT=COMPACT. CHAR columns
used to be unnecessarily written as nonempty strings of bytes.
For ROW_FORMAT=REDUNDANT, we must reserve fixed-length dummy values
for the CHAR columns in the metadata record. This is because in
MariaDB Server 10.4, btr_cur_instant_init_low() will rely on
dict_index_t::trx_id_offset being accurate for the metadata record.
In MariaDB Server 10.4, btr_cur_instant_init_low() assumes that
all PRIMARY KEY columns that are internally variable-length will
be encoded in 0 bytes in the metadata record. Sometimes, CHAR
columns can be encoded as variable-length. We should not
unnecessarily reserve space for a dummy string value in the
metadata record.
On order to unify the two *.yy files easier,
this patch collects all different rules to the end of *.yy files,
so the rule section looks like this:
%%
common rules
different rules
DropIndex, CreateIndex: Remove. The file row0trunc.cc only exists
in MariaDB Server 10.3 so that the crash recovery of TRUNCATE TABLE
operations from older 10.2 and 10.3 servers will work. This dead code
was being used for implementing the MySQL 5.7 WL#6501 TRUNCATE TABLE
that was replaced with a backup-safe implementation in MDEV-13564.
buf_read_ibuf_merge_pages(): Discard any page numbers that are
outside the current bounds of the tablespace, by invoking the
function ibuf_delete_recs() that was introduced in MDEV-20934.
This could avoid an infinite change buffer merge loop on
innodb_fast_shutdown=0, because normally the change buffer merge
would only be attempted if a page was successfully loaded into
the buffer pool.
dict_drop_index_tree(): Add the parameter trx_t*.
To prevent the DROP TABLE crash, do not invoke btr_free_if_exists()
if the entire .ibd file will be dropped. Thus, we will avoid a crash
if the BTR_SEG_LEAF or BTR_SEG_TOP of the index is corrupted,
and we will also avoid unnecessarily accessing the to-be-dropped
tablespace via the buffer pool.
In MariaDB 10.2, we disable the DROP TABLE fix if innodb_safe_truncate=0,
because the backup-unsafe MySQL 5.7 WL#6501 form of TRUNCATE TABLE
requires that the individual pages be freed inside the tablespace.
When commit 09af00cbde
removed the crash-upgrade logic of old TRUNCATE TABLE
from MariaDB 10.2 and 10.3, it actually made the return
value of dict_drop_index_tree() redundant.
Adding:
- new class sp_expr_lex
- new grammar rule expr_lex, which includes both reset_lex()
and its corresponding restore_lex()
Also:
- Moving a few methods from LEX to sp_expr_lex.
- Moving the code from *.yy to new method sp_expr_lex methods
sp_repeat_loop_finalize() and sp_if_expr().
This change makes it easier to edit the related grammar
(and makes it easier to unify sql_yacc.yy and sql_yacc_ora.yy later).
Fix partitioning and DS-MRR to work together
- In ha_partition::index_end(): take into account that ha_innobase (and
other engines using DS-MRR) will have inited=RND when initialized for
DS-MRR scan.
- In ha_partition::multi_range_read_next(): if the MRR scan is using
HA_MRR_NO_ASSOCIATION mode, it is not guaranteed that the partition's
handler will store anything into *range_info.
- In DsMrr_impl::choose_mrr_impl(): ha_partition will inquire partitions
about how much memory their MRR implementation needs by passing
*buffer_size=0. DS-MRR code didn't know about this (actually it used
uint for buffer size calculation and would have an under-flow).
Returning *buffer_size=0 made ha_partition assume that partitions do
not need MRR memory and pass the same buffer to each of them.
Now, this is fixed. If DS-MRR gets *buffer_size=0, it will return
the amount of buffer space needed, but not more than about
@@mrr_buffer_size.
* Fix ha_{innobase,maria,myisam}::clone. If ha_partition uses MRR on its
partitions, and partition use DS-MRR, the code will call handler->clone
with TABLE (*NOT partition*) name as an argument.
DS-MRR has no way of knowing the partition name, so the solution was
to have the ::clone() function for the affected storage engine to ignore
the name argument and get it elsewhere.
After MDEV-11556, not even crash recovery should attempt to access
non-existing pages. But, buf_load() is not validating its input
and must thus be able to ignore missing pages, so that is why
buf_read_page_background() does that.
Almost all threads have gone
- the "ticking" threads, that sleep a while then do some work)
(srv_monitor_thread, srv_error_monitor_thread, srv_master_thread)
were replaced with timers. Some timers are periodic,
e.g the "master" timer.
- The btr_defragment_thread is also replaced by a timer , which
reschedules it self when current defragment "item" needs throttling
- the buf_resize_thread and buf_dump_threads are substitutes with tasks
Ditto with page cleaner workers.
- purge workers threads are not tasks as well, and purge cleaner
coordinator is a combination of a task and timer.
- All AIO is outsourced to tpool, Innodb just calls thread_pool::submit_io()
and provides the callback.
- The srv_slot_t was removed, and innodb_debug_sync used in purge
is currently not working, and needs reimplementation.
The library is capable of
- asynchronous execution of tasks (and optionally waiting for them)
- asynchronous file IO
This is implemented using libaio on Linux and completion ports on
Windows. Elsewhere, async io is "simulated", which means worker threads
are performing synchronous IO.
- timers, scheduling work asynchronously in some point of the future.
Also periodic timers are implemented.
This is a prerequisite patch required to remove Innodb's
thd_destructor_proxy thread.
The patch implement pre-shutdown functionality for handlers.
A storage engine might need to perform some work after all user
connections are shut down, but before killing off the plugins.
The reason is that an SE could still be using some of the
server infrastructure. In case of Innodb this would be purge threads,
that call into the server to calculate results of virtual function,
acquire MDL locks on tables, or possibly also use the audit plugins.
Create background THDs that are not counted, and do not block
close_connections(), in other words they are just something that plugin
can use internally.
These THD will be used later by Innodb purge threads, and they need THD
to calculate virtual functions.
Threadpool will need a functionality for periodic thr_timer
(the threadpool maintainence task is a timer that runs periodically).
Also increase the stack size for the timer thread, 8k won't be enough.
- Added dynamic column width fro `user` and `db` columns
- Added option `%` to hide the progress
- Follow up changes from Ubuntu
- Take `INFORMATION_SCHEMA.PROCESSLIST` into account to determine
progress of the `actual stage`
- Indentation fixes
Closes#215