btr_pcur_store_position(): Replace a too strict debug assertion.
It is possible to have a clustered index B-tree for a logically
empty table, which will consist of a node pointer from the root
page to a leaf page that contains the metadata record.
The too strict debug assertion was added in
commit 0e5a4ac253 (MDEV-15562).
For fields with type INET, during EITS collection the min and max values are stored in text
representation in the statistical table.
While retrieving the value from the statistical table, the value was stored back into the original
field in binary form instead of text, and this resulted in a crash.
Introduced 2 functions in the Field structure:
1) store_to_statistical_minmax_field
2) store_from_statistical_minmax_field
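Below is a minimal, self-contained sketch of the round trip these two
functions are meant to guarantee. The type, method names and parsing code
are illustrative stand-ins, not the actual Field API: the point is only
that the min/max value always travels through its text form, both when it
is written to the statistical table and when it is read back.

    #include <cassert>
    #include <cstdio>
    #include <string>

    // Illustrative stand-in for an INET-like field: the binary form must
    // never be copied into the statistical table; the value always goes
    // through its text representation in both directions.
    struct MockInetField
    {
      unsigned char addr[4];

      std::string to_text() const                     // analogous to val_str()
      {
        return std::to_string(addr[0]) + "." + std::to_string(addr[1]) + "." +
               std::to_string(addr[2]) + "." + std::to_string(addr[3]);
      }

      // analogous to store_to_statistical_minmax_field(): save the text form
      void store_to_stat(std::string *stat) const { *stat= to_text(); }

      // analogous to store_from_statistical_minmax_field(): parse the text form
      bool store_from_stat(const std::string &stat)
      {
        unsigned a, b, c, d;
        if (std::sscanf(stat.c_str(), "%u.%u.%u.%u", &a, &b, &c, &d) != 4)
          return true;                                // conversion error
        addr[0]= a; addr[1]= b; addr[2]= c; addr[3]= d;
        return false;
      }
    };

    int main()
    {
      MockInetField min{{127, 0, 0, 1}}, restored{};
      std::string stat;
      min.store_to_stat(&stat);                       // "127.0.0.1" stored as text
      assert(!restored.store_from_stat(stat));        // parsed back, not memcpy'd
      assert(restored.to_text() == "127.0.0.1");
    }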
With MDEV-16678, InnoDB background tasks (most notably, the purge of
committed transaction history) can acquire metadata locks.
Because of this, lock_wait_timeout=0 is too strict and must
be relaxed.
The test used to fail easily if an extra sleep was added to
the end of dict_table_close(), before the MDL release. Now,
with lock_wait_timeout=1, the test passes even with an extra
0.1-second sleep added to dict_table_close().
Thanks to Monty for providing this fix.
commit f74023b955 (MDEV-15090)
inadvertently removed an mtr_t::commit() call from
trx_undo_report_rename(), causing an InnoDB hang if
we failed to log a RENAME operation.
It is unclear whether this condition is possible in practice.
The test case involved SET GLOBAL innodb_trx_rseg_n_slots_debug=1
and a failed CREATE TABLE...SELECT, whose error handling would
internally invoke RENAME in InnoDB.
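The shape of the problem can be illustrated with a small, self-contained
sketch (stand-in types, not the actual trx_undo_report_rename() code):
the mini-transaction started inside the function has to be committed on
the error path too, otherwise the resources it holds are never released.

    #include <cstdio>

    // Stand-in for InnoDB's mtr_t, just to show the commit-on-every-path rule.
    struct MiniTransaction
    {
      bool active= false;
      void start()  { active= true; }
      void commit() { active= false; }
      ~MiniTransaction()
      { if (active) std::printf("mini-transaction never committed\n"); }
    };

    // Returns true on error, like the failed undo logging of the RENAME.
    static bool report_rename_sketch(bool logging_fails)
    {
      MiniTransaction mtr;
      mtr.start();
      bool err= logging_fails;   // stand-in for the undo-log write failing
      mtr.commit();              // the restored call: commit even on failure
      return err;
    }

    int main()
    {
      report_rename_sketch(true);   // no "never committed" message is printed
    }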
log_buf_pool_get_oldest_modification(): Acquire
log_sys_t::flush_order_mutex in order to prevent a race condition
that was introduced in
commit 1a6f708ec5 (MDEV-15058).
Before that change, log_buf_pool_get_oldest_modification()
was protected by both log_sys.mutex and log_sys.flush_order_mutex
like it was supposed to be ever since
commit a52c4820a3 (MySQL 5.5.10).
buf_pool_t::get_oldest_modification(): Replaces
buf_pool_get_oldest_modification(), to emphasize that
log_sys.flush_order_mutex must be acquired by the caller if needed.
log_close(): Invoke log_buf_pool_get_oldest_modification()
in order to ensure a clean shutdown.
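The locking rule this restores can be sketched as follows; the types and
names below are simplified stand-ins for log_sys/buf_pool, not the actual
code. The oldest-modification LSN may only be sampled under the
flush-order mutex, so that a mini-transaction cannot be caught between
releasing log_sys.mutex and adding its dirty pages to the flush list.

    #include <cstdint>
    #include <mutex>

    // Simplified stand-in for the buffer pool / log subsystem.
    struct BufferPoolSketch
    {
      std::mutex flush_order_mutex;   // stand-in for log_sys.flush_order_mutex
      uint64_t oldest_dirty_lsn= 0;   // 0 means the flush list is empty
      uint64_t current_lsn= 100;      // stand-in for log_sys.lsn

      // Analogous to log_buf_pool_get_oldest_modification() after the fix:
      // only under flush_order_mutex may we conclude "everything is clean"
      // and let the checkpoint advance up to the current LSN.
      uint64_t oldest_modification_for_checkpoint()
      {
        std::lock_guard<std::mutex> guard(flush_order_mutex);
        uint64_t lsn= oldest_dirty_lsn;
        return lsn ? lsn : current_lsn;
      }
    };

    int main()
    {
      BufferPoolSketch pool;
      return pool.oldest_modification_for_checkpoint() == 100 ? 0 : 1;
    }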
The scenario of the race condition is as follows:
1. The buffer pool is clean (no writes are pending).
2. mtr_add_dirtied_pages_to_flush_list() releases log_sys.mutex.
3. log_buf_pool_get_oldest_modification() observes that the
buffer pool is clean and returns log_sys.lsn.
4. log_checkpoint() completes, writing a wrong checkpoint header
according to which everything up to log_sys.lsn was clean.
5. mtr_add_dirtied_pages_to_flush_list() completes the execution
of mtr_memo_note_modifications(), releases the page latches and
the flush_order_mutex.
6. On a subsequent log_checkpoint(), the assertion could fail
if the page modifications had not been flushed yet.
The failing assertion (which is valid) was added in MySQL 5.7
mysql/mysql-server@5c6c6ec693
and merged to MariaDB Server 10.2.2 in
commit fec844aca8.
For character sets and collations where the character-to-weight mapping is greater than 1,
we need to make sure that while creating a sort key,
a temporary buffer is created to store the value of the item returned by the val_str function,
and that value is then copied back to the sort buffer.
In this case, when using a priority queue, Sort_param::tmp_buffer was not allocated.
Minor refactoring:
Changed Sort_param::tmp_buffer from char* to String
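A minimal, self-contained sketch of that pattern follows; the types are
stand-ins (not the server's Sort_param/String/my_strnxfrm), and the
expansion factor simply models a collation whose weight string is longer
than the character string.

    #include <string>
    #include <vector>

    struct SortParamSketch
    {
      std::string tmp_buffer;            // analogous to Sort_param::tmp_buffer

      // The item value is first materialized in tmp_buffer (analogous to
      // item->val_str(&tmp_buffer)) and only then expanded into the sort
      // key; writing straight into the sort buffer is unsafe once each
      // character produces more than one weight byte.
      void make_sortkey(const std::string &item_value, size_t weights_per_char,
                        std::vector<unsigned char> *sort_key)
      {
        tmp_buffer= item_value;
        sort_key->clear();
        for (unsigned char c : tmp_buffer)          // stand-in for my_strnxfrm()
          for (size_t i= 0; i < weights_per_char; i++)
            sort_key->push_back(c);
      }
    };

    int main()
    {
      SortParamSketch param;
      std::vector<unsigned char> key;
      param.make_sortkey("abc", 2, &key);           // 3 characters, 6 weight bytes
      return key.size() == 6 ? 0 : 1;
    }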
Problem:
=======
While evicting the uncompressed page from the buffer pool, InnoDB writes
the checksum for the compressed page in buf_LRU_free_page().
So while flushing the compressed page, checksum validation fails
when the innodb_checksum_algorithm variable has been changed to strict_none.
Solution:
========
- Calculate the checksum only while flushing the page. Removed the
checksum write from buf_LRU_free_page().
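A small, self-contained sketch of that ordering follows; the names and
constants are hypothetical stand-ins for the page-flush path, not the
actual InnoDB functions. The checksum is computed with whatever algorithm
is configured at the moment the page is flushed, so a change of
innodb_checksum_algorithm between eviction and flushing can no longer
produce a stale checksum.

    #include <cstdint>

    enum class ChecksumAlgo { CRC32, INNODB, NONE };  // stand-in for the setting

    struct CompressedPageSketch
    {
      uint32_t stored_checksum= 0;
      uint32_t payload= 42;            // stand-in for the page contents
    };

    // Stand-in for the page_zip checksum calculation.
    static uint32_t calc_checksum(const CompressedPageSketch &page,
                                  ChecksumAlgo algo)
    {
      return algo == ChecksumAlgo::NONE ? 0xdeadbeefU : page.payload * 2654435761U;
    }

    // Flush path: the checksum is stamped here, with the algorithm that is
    // currently configured, instead of earlier at eviction time.
    static void flush_page(CompressedPageSketch *page, ChecksumAlgo current_algo)
    {
      page->stored_checksum= calc_checksum(*page, current_algo);
      // ... the page would be written to the data file here ...
    }

    int main()
    {
      CompressedPageSketch page;
      flush_page(&page, ChecksumAlgo::NONE);   // validated with the same algorithm
      return page.stored_checksum == 0xdeadbeefU ? 0 : 1;
    }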
Item_sum_sp did not override val_native(), so the reported script
crashed on the DBUG_ASSERT() in the default implementation, Item::val_native().
Implementing a correct Item_sum_sp::val_native().
The existing implementation used my_checksum (from mysys)
for calculating the table checksum and binlog checksum.
That implementation was optimized for POWERPC only; for x86 and ARM it
lacked SIMD implementations (using clmul and ACLE respectively) and
used zlib-crc32 instead.
mariabackup had its own copy of the crc32 implementation, hardware
optimized only for x86, and lacked hardware-based implementations
for POWERPC and ARM.
This patch unifies all such calls behind a single interface,
my_checksum().
The unification also enables hardware-optimized calls on all
architectures, viz. x86, ARM and POWERPC.
The default always falls back to zlib crc32.
Thanks to Daniel Black for reviewing, fixing and testing
PowerPC changes. Thanks to Marko and Daniel for early code feedback.
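A minimal usage sketch of the unified interface follows. The prototype
shown is an assumption about the my_sys.h declaration (a running CRC
value, a byte pointer and a length), and the body is a stub that does
what the generic fallback does, namely delegate to zlib's crc32(); the
real my_checksum() dispatches to the hardware-optimized code where
available.

    #include <cstddef>
    #include <zlib.h>

    typedef unsigned int ha_checksum;   // assumed to match the my_sys.h typedef

    // Stub with the assumed my_checksum() prototype; the generic fallback
    // behaviour is zlib crc32, which is all this stand-in ever does.
    static ha_checksum my_checksum_stub(ha_checksum crc,
                                        const unsigned char *data, size_t len)
    {
      return (ha_checksum) crc32(crc, data, (uInt) len);
    }

    int main()
    {
      const unsigned char part1[]= "hello ", part2[]= "world";
      ha_checksum crc= 0;
      crc= my_checksum_stub(crc, part1, sizeof(part1) - 1);  // calls can be chained
      crc= my_checksum_stub(crc, part2, sizeof(part2) - 1);  // over multiple chunks
      return crc != 0 ? 0 : 1;
    }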
The opt_for_user subrule was incorrectly scanned before sp_create_assignment_lex(),
so the user name and the host were created on a wrong memory root.
- Reorganizing the grammar to make sure that sp_create_assignment_lex()
is called immediately after PASSWORD_SYM is scanned, so all attributes
are then allocated on its memory root.
- Moving the semantic code into methods of LEX, so the grammar looks as simple as possible.
- Changing text_or_password to be of the data type USER_AUTH*.
As a side effect, the LEX::definer member is now not used when processing
the SET PASSWORD statement. Everything is done using Bison's stack.
The bug was introduced by this commit:
commit bf5a144e16
This change also affects information_schema.tables.
The create table option "transactional=0 | 1" is now always shown for
storage engines that support both transactional/crash-safe tables and
non-transactional tables.
Before this patch the transactional=... option was only shown if the user
specified transactional=... in the CREATE TABLE or ALTER TABLE statement.
The reason for the change was to be able to make it easy to know if an Aria
table is transactional or not.
Previously, multiple threads were allowed to load histograms concurrently.
There were no known problems caused by this, but given the amount of data
races in this code, they would have appeared sooner or later.
To avoid a scalability bottleneck, histogram loading is protected by a
per-TABLE_SHARE atomic variable.
Whenever histograms were already loaded by a preceding statement (the hot
path), a scalable load-acquire check is performed.
Whenever histograms have to be loaded anew, mutual exclusion among loaders
is established by the atomic variable. If histograms are being loaded
concurrently, the statement waits until the load is completed.
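A self-contained sketch of this protocol follows (the atomic variable
takes three states: not loaded, being loaded, loaded). The names are
stand-ins; in the server the state lives in TABLE_STATISTICS_CB, and the
waiting loop is simplified here to a yield loop.

    #include <atomic>
    #include <thread>

    enum hist_state { HIST_NOT_LOADED, HIST_LOADING, HIST_LOADED };

    struct ShareSketch                       // stand-in for the TABLE_SHARE side
    {
      std::atomic<int> state{HIST_NOT_LOADED};

      template <typename Loader>
      void load_histograms_once(Loader load)
      {
        // Hot path: histograms were already loaded by a preceding statement.
        if (state.load(std::memory_order_acquire) == HIST_LOADED)
          return;

        int expected= HIST_NOT_LOADED;
        if (state.compare_exchange_strong(expected, HIST_LOADING,
                                          std::memory_order_acquire))
        {
          load();                            // we won the right to load
          state.store(HIST_LOADED, std::memory_order_release);
          return;
        }

        // Someone else is loading concurrently: wait until they finish.
        while (state.load(std::memory_order_acquire) != HIST_LOADED)
          std::this_thread::yield();
      }
    };

    int main()
    {
      ShareSketch share;
      std::atomic<int> loads{0};
      std::thread t1([&] { share.load_histograms_once([&] { loads++; }); });
      std::thread t2([&] { share.load_histograms_once([&] { loads++; }); });
      t1.join(); t2.join();
      return loads == 1 ? 0 : 1;             // exactly one thread did the load
    }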
- Table_statistics::total_hist_size moved to TABLE_STATISTICS_CB: only
meaningful within TABLE_SHARE (not used for collected stats).
- TABLE_STATISTICS_CB::histograms_can_be_read and
TABLE_STATISTICS_CB::histograms_are_read are replaced with a tri-state
atomic variable.
- Simplified away alloc_histograms_for_table_share().
Note: there is still likely a data race if a thread attempts to access
histogram data after it failed to load it (because of a concurrent load).
It was there previously and is out of the scope of this effort. One way
of fixing it could be to revive TABLE::histograms_are_read and add
appropriate checks wherever they are needed.
Part of MDEV-19061 - table_share used for reading statistical tables is
not protected
Previously, multiple threads were allowed to load statistics concurrently.
There were no known problems caused by this, but given the amount of data
races in this code, they would have appeared sooner or later.
To avoid a scalability bottleneck, statistics loading is protected by a
per-TABLE_SHARE atomic variable.
Whenever statistics were already loaded by a preceding statement (the hot
path), a scalable load-acquire check is performed.
Whenever statistics have to be loaded anew, mutual exclusion among loaders
is established by the atomic variable. If statistics are being loaded
concurrently, the statement waits until the load is completed.
TABLE_STATISTICS_CB::stats_can_be_read and
TABLE_STATISTICS_CB::stats_is_read are replaced with a tri-state atomic
variable.
Part of MDEV-19061 - table_share used for reading statistical tables is
not protected
Removed redundant loops, integrating the logic into the caller instead.
Unified the condition in read_statistics_for_tables(): fewer
"table_share != NULL" checks and no more potential "table_share == NULL"
dereferencing.
Part of MDEV-19061 - table_share used for reading statistical tables is
not protected
A crash was observed where dict_acquire_mdl_shared<trylock=false>
would invoke memcpy() with an apparently uninitialized tbl_len.
dict_table_t::parse_name(): Remove an unnecessary tbl_len--
operation. (This should be mostly non-functional cleanup.)
dict_acquire_mdl_shared(): If the second dict_table_t::parse_name()
returns false, terminate the loop just like we would do on the
first invocation.
In Item_nodeset_func_predicate::val_nodeset, args[1] is not necessarily
an Item_func descendant. It can be Item_bool.
Removing a wrong cast. It was not really needed anyway.
commit d09aec7a15 (MDEV-19940)
caused a regression. We made wait_lock_get_heap_no() return
uint16_t instead of ulint, and we mostly replaced the previous
magic value ULINT_UNDEFINED with 0. But we failed to adjust
some assertions. Furthermore, 0 is a valid, although rare, value
for record locks. (Record locks can be temporarily stored on the
page infimum in some operations that involve multiple leaf pages.)
Let us use 0xFFFF as the magic value. Valid heap numbers
are limited to less than 9362 = innodb_page_size/(5+1+1)
when using a minimal 1-byte PRIMARY KEY and a
secondary index on a NULL or '' column.
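A quick worked check of that bound, under the assumption that the figure
refers to the maximum innodb_page_size of 64KiB: 5 bytes of record header
plus 1 byte for the shortest PRIMARY KEY plus 1 byte for the indexed
NULL/'' column gives 7 bytes per record, and 65536/7 = 9362, comfortably
below the new magic value 0xFFFF.

    #include <cstdint>

    int main()
    {
      // Assumes the 64KiB maximum innodb_page_size for the 9362 figure.
      constexpr uint32_t max_page_size= 65536;
      constexpr uint32_t min_record_size= 5 + 1 + 1;   // header + PK + '' column
      static_assert(max_page_size / min_record_size == 9362,
                    "heap number bound");
      static_assert(0xFFFF > max_page_size / min_record_size,
                    "0xFFFF cannot collide with a valid heap number");
      return 0;
    }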