Before that line there is a call to buf_page_get_gen() that could
return block = NULL when decrypting a page fails. However,
we should also set error to a value other than DB_SUCCESS. The error
log contained a message about decompression, but in that code there
is one case where the error is not set correctly.
innobase_add_instant_try(): If the leftmost leaf page does not contain
other records than the 'default row', only empty the table if there
are no successor pages.
When a table or partition that was not empty during a previous
instant ADD COLUMN has become empty later, this subsequent
instant ADD COLUMN gives us the opportunity to convert the empty table
or partition to 'non-instant' format.
Similarly, if the table or partition is empty to begin with, that is,
it does not even contain a 'default row' record, we can use the
'non-instant' format.
Compiler optimizations were switched off because of
MySQL Bug #19424, #36366, #34297, which were blamed on an alleged
compiler bug.
No proper analysis of code generation was done back then, thus proof of
a compiler bug is missing.
Even if there was a compiler bug 13 years ago, it could have been fixed.
We will wait and see if there are any complaints or crashes.
Rollback attempted to dereference DB_ROLL_PTR=0, which cannot possibly
be a valid undo log pointer. A safe canonical value would be
roll_ptr_t(1) << ROLL_PTR_INSERT_FLAG_POS
which is what was chosen in MDEV-12288.
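The following is a minimal sketch of the safe value, not the InnoDB
code itself. It assumes that ROLL_PTR_INSERT_FLAG_POS is 55, consistent
with the DB_ROLL_PTR=1<<55 value mentioned elsewhere in these notes.
The names SAFE_ROLL_PTR, roll_ptr_is_insert() and
roll_ptr_has_older_version() are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

typedef uint64_t roll_ptr_t;

// Assumed bit position of the "insert" flag in DB_ROLL_PTR.
constexpr unsigned ROLL_PTR_INSERT_FLAG_POS = 55;

// Canonical "fresh insert" value: only the insert flag is set.
constexpr roll_ptr_t SAFE_ROLL_PTR =
  roll_ptr_t(1) << ROLL_PTR_INSERT_FLAG_POS;

// True if the pointer is flagged as an insert.
inline bool roll_ptr_is_insert(roll_ptr_t roll_ptr)
{
  return (roll_ptr >> ROLL_PTR_INSERT_FLAG_POS) & 1;
}

// True if it makes sense to look for an older record version.
inline bool roll_ptr_has_older_version(roll_ptr_t roll_ptr)
{
  assert(roll_ptr != 0); // 0 can never be a valid undo log pointer
  return !roll_ptr_is_insert(roll_ptr);
}
```

With this value, a reader that checks the insert flag first never
follows the pointer, which is the property that DB_ROLL_PTR=0 lacks.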
This bug was reproduced in 10.3 only. Potentially, the problem could
have been introduced by MDEV-11415, which suppresses undo logging for
ALGORITHM=COPY operations. In those operations, we should actually
have written the safe value of DB_ROLL_PTR instead of writing 0.
However, the test in commit 5421e3aee7
demonstrates that access to the rebuilt table by earlier-started
transactions should actually have been refused with ER_TABLE_DEF_CHANGED.
btr_cur_ins_lock_and_undo(): When undo logging is disabled, use the
safe value of DB_ROLL_PTR.
btr_cur_optimistic_insert(): Validate the DB_TRX_ID,DB_ROLL_PTR before
inserting into a clustered index leaf page.
ins_node_t::sys_buf[]: Replaces row_id_buf and trx_id_buf and some
heap usage.
row_ins_alloc_sys_fields(): Initialize ins_node_t::sys_buf[].
trx_undo_page_report_modify(): Assert that the DB_ROLL_PTR is not 0.
trx_undo_get_undo_rec_low(): Assert that the roll_ptr is valid before
trying to dereference it.
dict_index_t::is_primary(): Check if the index is the primary key.
PageConverter::adjust_cluster_record(): Instead of writing
the invalid value DB_ROLL_PTR=0, write a value that indicates
a fresh insert, that is, prevents the DB_ROLL_PTR from being
dereferenced in any circumstances.
It can be argued that IMPORT TABLESPACE should actually
update the dict_index_t::trx_id to prevent older transactions
from accessing the table, similar to what I did on table
rebuild in MySQL 5.6.6 in
03f81a55f2
trx_undo_rec_copy(): Use a debug assertion. In a non-debug build,
with len<0 (which this assertion is really testing for)
we should still typically crash due to running out of memory.
Replace all occurrences of the is_clust() method with is_primary(),
because that is what is actually meant. (Also the change buffer
tree would count as a clustered index.)
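A sketch of the distinction, under the assumption that the index type
is a bit mask in which the change buffer tree carries both a
'clustered' flag and an 'ibuf' flag. The flag values and the struct
index_sketch_t are illustrative only.

```cpp
#include <cassert>

// Assumed flag values for this sketch.
constexpr unsigned DICT_CLUSTERED = 1; // clustered index
constexpr unsigned DICT_IBUF = 8;      // change buffer tree

struct index_sketch_t {
  unsigned type;

  // Any clustered B-tree, including the change buffer tree.
  bool is_clust() const { return (type & DICT_CLUSTERED) != 0; }

  // Clustered index of a table, but not the change buffer tree.
  bool is_primary() const
  {
    return (type & (DICT_CLUSTERED | DICT_IBUF)) == DICT_CLUSTERED;
  }
};
```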
Rollback attempted to dereference DB_ROLL_PTR=0, which cannot possibly
be a valid undo log pointer. A safer canonical value would be
roll_ptr_t(1) << ROLL_PTR_INSERT_FLAG_POS
which is what was chosen in MDEV-12288, corresponding to reset_trx_id.
No deterministic test case for the bug was found. The simplest test
cases may be related to MDEV-11415, which suppresses undo logging for
ALGORITHM=COPY operations. In those operations, in the spirit of
MDEV-12288, we should actually have written reset_trx_id instead of
using the transaction identifier of the current transaction
(and a bogus value of DB_ROLL_PTR=0). However, thanks to MySQL Bug#28432
which I had fixed in MySQL 5.6.8 as part of WL#6255, access to the
rebuilt table by earlier-started transactions should actually have been
refused with ER_TABLE_DEF_CHANGED.
reset_trx_id: Move the definition to data0type.cc and the declaration
to data0type.h.
btr_cur_ins_lock_and_undo(): When undo logging is disabled, use the
safe value that corresponds to reset_trx_id.
btr_cur_optimistic_insert(): Validate the DB_TRX_ID,DB_ROLL_PTR before
inserting into a clustered index leaf page.
ins_node_t::sys_buf[]: Replaces row_id_buf and trx_id_buf and some
heap usage.
row_ins_alloc_sys_fields(): Init ins_node_t::sys_buf[] to reset_trx_id.
row_ins_buf(): Only if undo logging is enabled, copy trx->id
to node->sys_buf. Otherwise, rely on the initialization in
row_ins_alloc_sys_fields().
row_purge_reset_trx_id(): Invoke mlog_write_string() with reset_trx_id
directly. (No functional change.)
trx_undo_page_report_modify(): Assert that the DB_ROLL_PTR is not 0.
trx_undo_get_undo_rec_low(): Assert that the roll_ptr is valid before
trying to dereference it.
dict_index_t::is_primary(): Check if the index is the primary key.
PageConverter::adjust_cluster_record(): Fix
MDEV-15249 Crash in MVCC read after IMPORT TABLESPACE
by resetting the system fields to reset_trx_id instead of writing
the current transaction ID (which will be committed at the
end of the IMPORT TABLESPACE) and DB_ROLL_PTR=0.
This can partially be viewed as a follow-up fix of MDEV-12288,
because IMPORT should already then have written
DB_TRX_ID=0 and DB_ROLL_PTR=1<<55 to prevent unnecessary
DB_TRX_ID lookups in subsequent accesses to the table.
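As an illustration, the system fields DB_TRX_ID=0 and DB_ROLL_PTR=1<<55
could be laid out as follows. This sketch assumes a 6-byte DB_TRX_ID
followed by a 7-byte DB_ROLL_PTR, both stored big-endian. The helper
names write_big_endian() and fill_reset_trx_id() are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t DATA_TRX_ID_LEN = 6;   // assumed size of DB_TRX_ID
constexpr std::size_t DATA_ROLL_PTR_LEN = 7; // assumed size of DB_ROLL_PTR

// Write the low `len` bytes of `value`, most significant byte first.
inline void write_big_endian(uint8_t* dest, uint64_t value, std::size_t len)
{
  for (std::size_t i = 0; i < len; i++)
    dest[i] = uint8_t(value >> (8 * (len - 1 - i)));
}

// Fill buf with DB_TRX_ID=0 followed by DB_ROLL_PTR=1<<55.
inline void fill_reset_trx_id(
  uint8_t (&buf)[DATA_TRX_ID_LEN + DATA_ROLL_PTR_LEN])
{
  write_big_endian(buf, 0, DATA_TRX_ID_LEN);
  write_big_endian(buf + DATA_TRX_ID_LEN, uint64_t(1) << 55,
                   DATA_ROLL_PTR_LEN);
}
```

Under these assumptions the only nonzero byte is the first byte of
DB_ROLL_PTR, which holds 0x80.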
MDEV-14222 Unnecessary 'cascade' memory allocation for every updated row
when there is no FOREIGN KEY
This reverts the MySQL 5.7.2 change
377774689b
which introduced these problems. MariaDB 10.2.2 inherited these problems
in commit 2e814d4702.
The FOREIGN KEY CASCADE and SET NULL operations implemented as
procedural recursion are consuming more than 8 kilobytes of stack
(9 stack frames) per iteration in a non-debug GNU/Linux AMD64 build.
This is why we need to limit the maximum recursion depth to 15 steps
instead of the 255 that it used to be in MySQL 5.7 and MariaDB 10.2.
A corresponding change was made in MySQL 5.7.21 in
7b26dc98a6
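The effect of the depth limit can be sketched as below. This is not
the InnoDB implementation. FK_MAX_CASCADE_DEPTH, cascade_sketch() and
worst_case_stack_bytes() are hypothetical names, and 8192 bytes per
step is only the lower bound quoted above.

```cpp
#include <cassert>
#include <cstddef>

constexpr unsigned FK_MAX_CASCADE_DEPTH = 15; // limit from the text

enum cascade_result { CASCADE_OK, CASCADE_TOO_DEEP };

// Follow a chain of `steps_needed` cascading operations recursively,
// refusing to go deeper than FK_MAX_CASCADE_DEPTH.
inline cascade_result cascade_sketch(unsigned steps_needed,
                                     unsigned depth = 0)
{
  if (steps_needed == 0) return CASCADE_OK;
  if (depth >= FK_MAX_CASCADE_DEPTH) return CASCADE_TOO_DEEP;
  return cascade_sketch(steps_needed - 1, depth + 1);
}

// Rough stack estimate for a given recursion limit.
constexpr std::size_t worst_case_stack_bytes(unsigned max_depth,
                                             std::size_t bytes_per_step = 8192)
{
  return max_depth * bytes_per_step;
}
```

With these numbers, 15 steps need about 120 KiB of stack, while 255
steps would need about 2 MiB.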
This corruption was introduced in MDEV-13331. It would have been caught
by the MySQL 5.7 test innodb.update-cascade which MariaDB was missing
until now.
row_ins_check_foreign_constraint(): Never replace err == DB_LOCK_WAIT
with other values than DB_LOCK_WAIT_TIMEOUT.
row_ins_cascade_calc_update_vec(): Remove the output parameter
fts_col_affected, and instead return whether any fulltext index
is affected by the cascade operation.
row_ins_foreign_check_on_constraint(): Narrow the scope of some
variables.
ib_dec_in_dtor: Remove.
Handle string length as size_t, consistently (almost always:))
Change function prototypes to accept size_t, where in the past
ulong or uint were used. Change local/member variables to size_t
when appropriate.
This fix excludes rocksdb, spider, sphinx and connect for now.
The problem was that the wrong error message was returned when an insert
returned an FK error and there was no duplicate key to process.
row_ins
If the error from the insert was DB_NO_REFERENCED_ROW and there was
no duplicate key, we should ignore ON DUPLICATE KEY UPDATE
and return the original error message.
Previously, the function could theoretically return an uninitialized
value if the system tablespace contained no data files. It should be
impossible for InnoDB to start up in such a scenario.
Two follow-up tasks were filed for MySQL 5.7.21 changes that
were not applied here:
MDEV-15179 performance_schema.file_instances does not reflect RENAME TABLE
MDEV-14222 Unnecessary 'cascade' memory allocation for every updated
row when there is no FOREIGN KEY
The merge omitted some InnoDB and XtraDB conflict resolutions,
most notably, failing to merge the fix of MDEV-12173.
ibuf_merge_or_delete_for_page(), lock_rec_block_validate():
Invoke fil_space_acquire_silent() instead of fil_space_acquire().
This fixes MDEV-12173.
wsrep_debug, wsrep_trx_is_aborting(): Removed unused declarations.
_fil_io(): Remove. Instead, declare default parameters for the XtraDB
fil_io().
buf_read_page_low(): Declare default parameters, and clean up some
callers.
os_aio(): Correct the macro that is defined when !UNIV_PFS_IO.
Bug#23590280 NO WARNING WHEN REDUCING INNODB_BUFFER_POOL_SIZE
INSIZE (sic) THE FIRST CHUNK
innodb_buffer_pool_size_validate(): Issue a warning if the
requested innodb_buffer_pool_size is less than
innodb_buffer_pool_chunk_size, because we cannot shrink individual
chunks.
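The check can be pictured roughly as follows. The function
validate_pool_size_sketch() and the wording of the warning are
hypothetical; only the comparison against the chunk size is taken from
the description above.

```cpp
#include <cassert>
#include <string>

// Return a warning text if the requested size is smaller than one
// chunk, or an empty string if there is nothing to warn about.
inline std::string validate_pool_size_sketch(unsigned long long requested,
                                             unsigned long long chunk_size)
{
  if (requested < chunk_size)
    return "requested innodb_buffer_pool_size is smaller than "
           "innodb_buffer_pool_chunk_size; chunks cannot be shrunk";
  return std::string();
}
```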
Import and adjust the innodb.innodb_buffer_pool_resize tests,
except innodb.innodb_buffer_pool_resize_debug, which would time out.
buf_pool_clear_hash_index(): Adjust assertions.
The algorithm change is based on a MySQL 8.0 fix for
BUG #26818787: ASSERTION: DATA0DATA.IC:430:TUPLE
by Krzysztof Kapuścik
ee606e62bb
If a record had been inserted in place of a delete-marked purgeable
record by modifying that record, and purge was accessing that record
before the off-page columns were written, row_build_index_entry()
would have returned NULL, causing a crash.
row_vers_non_virtual_fields_equal(): Check whether all non-virtual fields
of an index are equal. Replaces row_vers_non_vc_match(). A more complex
version of this function was called row_vers_non_vc_index_entry_match()
in the MySQL 8.0 fix.
row_vers_impl_x_locked_low(): This change is not directly related to
the reported problem, but apparently to the removal of the function
row_vers_non_vc_match(). This function checks if a secondary index
record was modified by a transaction that has not been committed yet.
For comparing the non-virtual columns, construct a secondary index
tuple from the table row.
row_vers_vc_matches_cluster(): Replace row_vers_non_vc_match() with
code that is equivalent to the row_vers_non_vc_index_entry_match()
in the MySQL 8.0 fix. Also, deduplicate some code by using goto.
The comment that I made in
commit 06299dddd4
is inaccurate. Replace the comment, and make the assertion
debug-only, because I cannot remember any reports of
it ever failing in these 10 years.
If crypt_block != NULL, the entire object crypt_pfx should be
guaranteed to be initialized, including m_size, which will have been
initialized in allocate_large(), either directly or via
allocate_trace().
With trx_sys_t::rw_trx_ids removal, MVCC snapshot overhead became
slightly higher. That is, instead of copying an array we now have to
iterate LF_HASH. All this is done under trx_sys.mutex protection.
This patch moves MVCC snapshot out of trx_sys.mutex.
Clean-ups:
Removed MVCC: doesn't make too much sense to keep it in a separate class
anymore.
Refactored ReadView so that it now calls register()/deregister() routines
(it was vice versa before).
ReadView doesn't have friends anymore. :(
Even less trx_sys.mutex references.
serialisation_list was supposed to instantly give minimum registered
transaction serialisation number. However maintaining and accessing
this list requires global mutex protection.
Since we already take MVCC snapshot by iterating trx_sys_t::rw_trx_hash,
it is cheap to integrate minimum registered transaction lookup into this
iteration.
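A minimal sketch of the idea, using std::unordered_map as a stand-in
for the lock-free rw_trx_hash. The types rw_trx_sketch and
snapshot_sketch and the function take_snapshot() are hypothetical.
The sketch assumes that a transaction that has not been serialised yet
carries the maximum possible number.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

typedef uint64_t trx_id_t;

struct rw_trx_sketch {
  trx_id_t no; // serialisation number; UINT64_MAX if not assigned yet
};

struct snapshot_sketch {
  std::vector<trx_id_t> ids; // sorted ids of registered transactions
  trx_id_t min_no;           // smallest serialisation number seen
};

// Collect the ids and the minimum serialisation number in one pass.
inline snapshot_sketch take_snapshot(
  const std::unordered_map<trx_id_t, rw_trx_sketch>& rw_trx_hash,
  trx_id_t max_trx_id)
{
  snapshot_sketch s;
  s.min_no = max_trx_id;
  for (const auto& element : rw_trx_hash) {
    s.ids.push_back(element.first);
    s.min_no = std::min(s.min_no, element.second.no);
  }
  std::sort(s.ids.begin(), s.ids.end());
  return s;
}
```

Because the minimum is computed during the same iteration, no separate
ordered list has to be maintained under a global mutex.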
Take snapshot of registered read-write transaction identifiers directly
from rw_trx_hash. It immediately saves one trx_sys.mutex lock, reduces
size of another critical section protected by this mutex, and makes
further optimisations like removing trx_sys_t::serialisation_list
possible.
Downside of this approach is bigger overhead for view opening, because
iterating LF_HASH is more expensive compared to taking snapshot of an
array. However for low concurrency overhead difference is negligible,
while for high concurrency mutex is much bigger evil.
Currently we still take trx_sys.mutex to serialise ReadView creation.
This is required to keep serialisation_list ordered by trx->no as well
as not to let purge thread to create more recent snapshot while another
thread gets suspended during creation of older snapshot. This will
become completely mutex free along with serialisation_list removal.
Compared to previous implementation removing element from rw_trx_hash
and serialisation_list is not atomic. We disregard all possible bad
consequences (if there're any) since it will be solved along with
serialisation_list removal.
trx_undo_mem_create_at_db_start(): Do not read TRX_UNDO_TRX_NO
unless the field is known to be valid, that is, the transaction
has been serialized and trx_purge_add_undo_to_history() has been
invoked.
Normally InnoDB pages would be zero-initialized on allocation
(since MySQL 5.5 or so), but the undo log pages skip that
mechanism. So, reused undo log pages can contain garbage.
Undo log headers can start at any offset (there can be
multiple undo log headers in the same undo log page).
Therefore, because the TRX_UNDO_TRX_NO is never explicitly
initialized on undo log header creation, its contents may
be garbage.
When InnoDB has completed the rollback of a recovered transaction,
it used to display the transaction identifier.
This was broken in MySQL 5.7.2 in
2f5f3cd3ac
which was merged to MariaDB 10.2.2 in
commit 2e814d4702.
trx_rollback_active(): Cache the transaction ID before it will be
reset by transaction commit. Do not display the message if the
rollback was interrupted by shutdown (MDEV-13797, MDEV-12352).
trx_write_serialisation_history(): Only invoke trx_sysf_get()
to exclusively lock the TRX_SYS page if some change really
has to be written to the page.
On transaction commit, we will still write some binlog and
Galera WSREP XID information.
FIXME: If this information has to be written, it should be
partitioned into the rollback segment pages.
InnoDB maintains an internal persistent sequence of transaction
identifiers. This sequence is used for assigning both transaction
start identifiers (DB_TRX_ID=trx->id) and end identifiers (trx->no)
as well as end identifiers for the mysql.transaction_registry table
that was introduced in MDEV-12894.
TRX_SYS_TRX_ID_WRITE_MARGIN: Remove. After this many updates of
the sequence we used to update the TRX_SYS page. We can avoid accessing
the TRX_SYS page if we modify the InnoDB startup so that it resurrects
the sequence from other pages of the transaction system.
TRX_SYS_TRX_ID_STORE: Deprecate. The field only exists for the purpose
of upgrading from an earlier version of MySQL or MariaDB.
Starting with this fix, MariaDB will rely on the fields
TRX_UNDO_TRX_ID, TRX_UNDO_TRX_NO in the undo log header page of
each non-committed transaction, and on the new field
TRX_RSEG_MAX_TRX_ID in rollback segment header pages.
Because of this change, setting innodb_force_recovery=5 or 6 may cause
the system to recover with trx_sys.get_max_trx_id()==0. We must adjust
checks for invalid DB_TRX_ID and PAGE_MAX_TRX_ID accordingly.
We will change the startup and shutdown messages to display the
trx_sys.get_max_trx_id() in addition to the log sequence number.
trx_sys_t::flush_max_trx_id(): Remove.
trx_undo_mem_create_at_db_start(), trx_undo_lists_init():
Add an output parameter max_trx_id, to be updated from
TRX_UNDO_TRX_ID, TRX_UNDO_TRX_NO.
TRX_RSEG_MAX_TRX_ID: New field, for persisting
trx_sys.get_max_trx_id() at the time of the latest transaction commit.
Startup is not reading the undo log pages of committed transactions.
We want to avoid additional page accesses on startup, as well as
trouble when all undo logs have been emptied.
On startup, we will simply determine the maximum value from all pages
that are being read anyway.
TRX_RSEG_FORMAT: Redefined from TRX_RSEG_MAX_SIZE.
Old versions of InnoDB wrote uninitialized garbage to unused data fields.
Because of this, we cannot simply introduce a new field in the
rollback segment pages and expect it to be always zero, like it would
if the database was created by a recent enough InnoDB version.
Luckily, it looks like the field TRX_RSEG_MAX_SIZE was always written
as 0xfffffffe. We will indicate a new subformat of the page by writing
0 to this field. This has the nice side effect that after a downgrade
to older versions of InnoDB, transactions should fail to allocate any
undo log, that is, writes will be blocked. So, there is no problem of
getting corrupted transaction identifiers after downgrading.
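The format check can be sketched as follows. The struct
rseg_header_sketch and the function restore_max_trx_id() are
hypothetical simplifications of the rollback segment header. The
sketch assumes that the persisted maximum may only be trusted when the
format field contains 0.

```cpp
#include <cassert>
#include <cstdint>

struct rseg_header_sketch {
  uint32_t format;     // TRX_RSEG_FORMAT, formerly TRX_RSEG_MAX_SIZE
  uint64_t max_trx_id; // TRX_RSEG_MAX_TRX_ID; may be garbage if format != 0
};

// Return the maximum of max_so_far and the persisted value, ignoring
// the persisted value when the page is in the old format.
inline uint64_t restore_max_trx_id(const rseg_header_sketch& header,
                                   uint64_t max_so_far)
{
  if (header.format != 0)
    return max_so_far; // old format, e.g. 0xfffffffe: field not valid
  return header.max_trx_id > max_so_far ? header.max_trx_id : max_so_far;
}
```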
trx_rseg_t::max_size: Remove.
trx_rseg_header_create(): Remove the parameter max_size=ULINT_MAX.
trx_purge_add_undo_to_history(): Update TRX_RSEG_MAX_TRX_ID
(and TRX_RSEG_FORMAT if needed). This is invoked on transaction commit.
trx_rseg_mem_restore(): If TRX_RSEG_FORMAT contains 0,
read TRX_RSEG_MAX_TRX_ID.
trx_rseg_array_init(): Invoke trx_sys.init_max_trx_id(max_trx_id + 1)
where max_trx_id was the maximum that was encountered in the rollback
segment pages and the undo log pages of recovered active, XA PREPARE,
or some committed transactions. (See trx_purge_add_undo_to_history()
which invokes trx_rsegf_set_nth_undo(..., FIL_NULL, ...);
not all committed transactions will be immediately detached from the
rollback segment header.)
trx_rseg_mem_restore(): Update the max_trx_id from the undo log pages.
trx_sys_init_at_db_start(): Remove; merge with trx_lists_init_at_db_start().
trx_undo_lists_init(): Move to the only calling module, trx0rseg.cc.
trx_undo_mem_create_at_db_start(): Declare globally. Return the number
of pages.
trx_undo_page_get_prev_rec(), trx_undo_page_get_last_rec(),
trx_undo_page_get_first_rec(), trx_undo_page_get_start():
Move to the only caller, trx0undo.cc.
Add some const qualifiers.
trx_sysf_t: Remove.
trx_sysf_get(): Return the TRX_SYS page, not a pointer within it.
trx_sysf_rseg_get_space(), trx_sysf_rseg_get_page_no():
Remove a parameter, and merge the declaration and definition.
Take the TRX_SYS page as a parameter.
TRX_SYS_N_RSEGS: Correct the comment.
trx_sysf_rseg_find_free(), trx_sys_update_mysql_binlog_offset(),
trx_sys_update_wsrep_checkpoint(): Take the TRX_SYS page as a parameter.
trx_rseg_header_create(): Add a parameter for the TRX_SYS page.
trx_sysf_rseg_set_space(), trx_sysf_rseg_set_page_no(): Remove;
merge to the only caller, trx_rseg_header_create().
srv_init_abort_low(): Call srv_shutdown_bg_undo_sources() so that if
startup aborts while creating InnoDB system tables, the shutdown will
proceed correctly.
This was done in, among other things:
- thd->db and thd->db_length
- TABLE_LIST tablename, db, alias and schema_name
- Audit plugin database name
- lex->db
- All db and table names in Alter_table_ctx
- st_select_lex db
Other things:
- Changed a lot of functions to take const LEX_CSTRING* as argument
for db, table_name and alias. See init_one_table() as an example.
- Changed some function arguments from LEX_CSTRING to const LEX_CSTRING
- Changed some lists from LEX_STRING to LEX_CSTRING
- threads_mysql.result changed because processlist_db wasn't always
correctly updated
- New append_identifier() function that takes LEX_CSTRING* as arguments
- Added new element tmp_buff to Alter_table_ctx to separate temp name
handling from temporary space
- Ensure we store the length after my_casedn_str() of table/db names
- Removed unused version of rename_table_in_stat_tables()
- Changed Natural_join_column::table_name and db_name() to never return
NULL (used for print)
- thd->get_db() now returns db as a printable string (thd->db.str or "")
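To illustrate the pointer-plus-length convention, here is a simplified
sketch. The struct lex_cstring_sketch mirrors the general shape of
LEX_CSTRING (a const char pointer and a length). The helpers
printable_db() and same_name() are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct lex_cstring_sketch {
  const char* str;    // may be NULL when no name is set
  std::size_t length; // length in bytes, excluding any terminator
};

// Return the name as a printable string: str, or "" if str is NULL.
inline const char* printable_db(const lex_cstring_sketch& db)
{
  return db.str ? db.str : "";
}

// Compare two names by length first, then by content.
inline bool same_name(const lex_cstring_sketch* a,
                      const lex_cstring_sketch* b)
{
  return a->length == b->length
    && (a->length == 0 || std::memcmp(a->str, b->str, a->length) == 0);
}
```

Passing the length along with the pointer avoids repeated strlen()
calls in callers that compare or copy names.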
MDEV-11415 Remove excessive undo logging during ALTER TABLE…ALGORITHM=COPY
Move a test from innodb.rename_table_debug to innodb.alter_copy.
ha_innobase::extra(HA_EXTRA_BEGIN_ALTER_COPY): Register id-versioned
tables so that mysql.transaction_registry will be updated, even for
empty tables that are subjected to ALTER TABLE…ALGORITHM=COPY.
If a crash occurs during ALTER TABLE…ALGORITHM=COPY, InnoDB would spend
a lot of time rolling back writes to the intermediate copy of the table.
To reduce the amount of busy work done, a work-around was introduced in
commit fd069e2bb3 in MySQL 4.1.8 and 5.0.2,
to commit the transaction after every 10,000 inserted rows.
A proper fix would have been to disable the undo logging altogether and
to simply drop the intermediate copy of the table on subsequent server
startup. This is what happens in MariaDB 10.3 with MDEV-14717,MDEV-14585.
In MariaDB 10.2, the intermediate copy of the table would be left behind
with a name starting with the string #sql.
This is a backport of a bug fix from MySQL 8.0.0 to MariaDB,
contributed by jixianliang <271365745@qq.com>.
Unlike recent MySQL, MariaDB supports ALTER IGNORE. For that operation
InnoDB must for now keep the undo logging enabled, so that the latest
row can be rolled back in case of an error.
In Galera cluster, the LOAD DATA statement will retain the existing
behaviour and commit the transaction after every 10,000 rows if
the parameter wsrep_load_data_splitting=ON is set. The logic to do
so (the wsrep_load_data_split() function and the call
handler::extra(HA_EXTRA_FAKE_START_STMT)) are joint work
by Ji Xianliang and Marko Mäkelä.
The original fix:
Author: Thirunarayanan Balathandayuthapani <thirunarayanan.balathandayuth@oracle.com>
Date: Wed Dec 2 16:09:15 2015 +0530
Bug#17479594 AVOID INTERMEDIATE COMMIT WHILE DOING ALTER TABLE ALGORITHM=COPY
Problem:
During ALTER TABLE, we commit and restart the transaction for every
10,000 rows, so that the rollback after recovery would not take so long.
Fix:
Suppress the undo logging during copy alter operation. If fts_index is
present then insert directly into fts auxiliary table rather
than doing at commit time.
ha_innobase::num_write_row: Remove the variable.
ha_innobase::write_row(): Remove the hack for committing every 10000 rows.
row_lock_table_for_mysql(): Remove the extra 2 parameters.
lock_get_src_table(), lock_is_table_exclusive(): Remove.
Reviewed-by: Marko Mäkelä <marko.makela@oracle.com>
Reviewed-by: Shaohua Wang <shaohua.wang@oracle.com>
Reviewed-by: Jon Olav Hauglid <jon.hauglid@oracle.com>
Remove unnecessary repeated lookups for undo pages.
trx_undo_assign(), trx_undo_assign_low(), trx_undo_seg_create(),
trx_undo_create(): Return the undo log block to the caller.
Inside InnoDB, each mini-transaction that generates any redo log records
will acquire log_sys->mutex during mtr_t::commit() in order to copy the
records into the global log_sys->buf for writing into the redo log file.
For single-row transactions, this incurs quite a bit of overhead.
We would use two mini-transactions for writing a record into a
freshly updated undo log page. (Only if the undo record will
not fit in that page, then we will have to commit and restart
the mini-transaction.)
trx_undo_assign(): Assign undo log for a persistent transaction,
or return the already assigned one.
trx_undo_assign_low(): Assign undo log for an operation on a
persistent or temporary table.
trx_undo_create(), trx_undo_reuse_cached(): Remove redundant parameters.
Merge the logic from trx_undo_mark_as_dict_operation().
Only invoke set_versioned() on trx_id versioned tables.
dict_table_t::versioned_by_id(): New accessor, to determine if
a table is system versioned by transaction ID.
When cloning oldest view, don't copy ReadView::m_creator_trx_id.
It means that the owner thread is now allowed to access this member
without trx_sys.mutex protection.
To achieve this we have to keep ReadView::m_creator_trx_id in
ReadView::m_ids. This is required to not let purge thread process
records owned by transaction associated with oldest view.
It is only required if transaction entered read-write mode before its
view was created.
If transaction entered read-write mode after its view was created
(trx_set_rw_mode()), purge thread won't be allowed to touch it because
m_low_limit_id >= m_creator_trx_id holds. Thus we don't have to add
this transaction id to ReadView::m_ids.
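A simplified sketch of why keeping the creator in m_ids is sufficient.
The struct read_view_sketch and the function clone_for_purge() are
hypothetical, and the visibility rule is a reduced version of the
usual read view check, not the actual ReadView code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

typedef uint64_t trx_id_t;

struct read_view_sketch {
  trx_id_t m_low_limit_id;     // changes by ids >= this are invisible
  trx_id_t m_up_limit_id;      // changes by ids < this are visible
  trx_id_t m_creator_trx_id;   // owner of the view, or 0 if none
  std::vector<trx_id_t> m_ids; // sorted ids active at view creation

  bool changes_visible(trx_id_t id) const
  {
    if (id == m_creator_trx_id || id < m_up_limit_id) return true;
    if (id >= m_low_limit_id) return false;
    return !std::binary_search(m_ids.begin(), m_ids.end(), id);
  }
};

// Clone the oldest view for purge without copying the creator id.
inline read_view_sketch clone_for_purge(const read_view_sketch& oldest)
{
  read_view_sketch copy = oldest;
  copy.m_creator_trx_id = 0;
  return copy;
}
```

In this sketch the owner still sees its own changes, while the clone
treats them as belonging to an active transaction, because the owner's
id is found in m_ids.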
Cleanups:
ReadView::ids_t: doesn't seem to make any sense, just complicates matters.
ReadView::copy_trx_ids(): doesn't make sense anymore, integrated into
caller.
ReadView::copy_complete(): not needed anymore.
ReadView copy constructors: don't seem to make any sense.
trx_purge_truncate_history(): removed view argument, access
purge_sys->view directly instead.
Moved mutex locking inside lock_rec_lock().
Moved monitor increment out of mutex.
Moved assertions that don't require protection out of mutex.
Removed duplicate assertions.
Moved duplicate debug injections into lock_rec_lock().
Let monitor updates use relaxed memory order.
Return directly without maintaining variables in lock_rec_lock_slow().
Moved lock_rec_lock_fast() body into lock_rec_lock(): saves at least one
trx_mutex_enter(), one switch() plus some code was moved out of mutex.
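The relaxed monitor update can be sketched with std::atomic. The
struct monitor_counter_sketch is hypothetical and only shows the
memory ordering. It assumes that the counter is purely statistical, so
that no other data depends on the order in which increments become
visible.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

struct monitor_counter_sketch {
  std::atomic<uint64_t> value{0};

  // The increment is atomic, but imposes no ordering on other memory.
  void inc() { value.fetch_add(1, std::memory_order_relaxed); }

  uint64_t read() const { return value.load(std::memory_order_relaxed); }
};
```

Such a counter can be updated outside the mutex, which shortens the
critical section.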
InnoDB RNG maintains global state, causing otherwise unnecessary bus
traffic. Even worse, this is cross-mutex traffic. That is, different
mutexes suffer from contention.
A fixed delay of 4 was verified to give the best throughput by OLTP
update index and read-write benchmarks on Intel Broadwell (2/20/40) and
ARM (1/46/46).
Traditionally, DROP TABLE and TRUNCATE TABLE discarded any locks that
may have been held on the table. This feels like an ACID violation.
Probably most occurrences of it were prevented by meta-data locks (MDL)
which were introduced in MySQL 5.5.
dict_table_t::n_foreign_key_checks_running: Reduce the number of
non-debug checks.
lock_remove_all_on_table(), lock_remove_all_on_table_for_trx(): Remove.
ha_innobase::truncate(): Acquire an exclusive InnoDB table lock
before proceeding. DROP TABLE and DISCARD/IMPORT were already doing
this.
row_truncate_table_for_mysql(): Convert the already started transaction
into a dictionary operation, and do not invoke lock_remove_all_on_table().
row_mysql_table_id_reassign(): Do not call lock_remove_all_on_table().
This function is only used in ALTER TABLE...DISCARD/IMPORT TABLESPACE,
which is already holding an exclusive InnoDB table lock.
TODO: Make n_foreign_key_checks_running a debug-only variable.
This would require two fixes:
(1) DROP TABLE: Exclusively lock the table beforehand, to prevent
the possibility of concurrently running foreign key checks (which
would acquire a table IS lock and then record S locks).
(2) RENAME TABLE: Find out if n_foreign_key_checks_running>0 actually
constitutes a potential problem.
While the bug was reported as a regression of
MDEV-11025 Make number of page cleaner threads variable dynamic
in MariaDB Server 10.3, the code that MariaDB Server 10.2
inherited from MySQL 5.7.4 (WL#6642) looks prone to similar errors.
pc_flush_slot(): If there is no work to do, reset the is_requested
signal, to avoid potential busy-waiting in
buf_flush_page_cleaner_worker(). If the coordinator thread has shut
down, avoid resetting the is_requested event, to avoid a potential
hang at shutdown if there are multiple worker threads.
mem_heap_free_heap_top(): Remove UNIV_MEM_ASSERT_W() and unpoison
the memory region first, because part of it may have been poisoned
by an earlier mem_heap_free_top() call.
Poison the address range at the end.
mem_heap_block_free(): Poison the address range at the end.
UNIV_MEM_ASSERT_AND_ALLOC(): Replace with UNIV_MEM_ALLOC().
We want to keep the address ranges poisoned (inaccessible) as
long as possible.
UNIV_MEM_ASSERT_AND_FREE(): Replace with UNIV_MEM_FREE().
InnoDB is issuing a 'noise' message that is not a sign of abnormal
operation. The only issuers of it are the debug function
lock_rec_block_validate() and the change buffer merge.
While the error should ideally never occur in transactional locking,
we happen to know that DISCARD TABLESPACE and TRUNCATE TABLE and
possibly DROP TABLE are breaking InnoDB table locks.
When it comes to the change buffer merge, the message simply is useless
noise. We know perfectly well that a tablespace can be dropped while a
change buffer merge is pending. And the code is prepared to handle that,
which is demonstrated by the fact that whenever the message was issued,
InnoDB did not crash.
fil_inc_pending_ops(): Remove the parameter print_err.