Based on mysql/mysql-server@bc9c46bf28
but without sleeps.
The test was verified to hit the debug assertion if the change to
fts_add_doc_by_id() in commit 2d98b967e3
was reverted.
fts_cache_t::total_size_at_sync: New field, to sample total_size.
fts_add_doc_by_id(): Invoke sync if total_size has grown too much
since the previous sync request. (Maintain cache->total_size_at_sync.)
ib_wqueue_t::length: Caches ib_list_len(*items).
ib_wqueue_len(): Removed. We will refer to fts_optimize_wq->length
directly.
Based on mysql/mysql-server@bc9c46bf28
trx_commit_in_memory(): Do not release the rseg reference before
trx_undo_commit_cleanup() has been invoked and the current transaction
is truly done with the rollback segment. The purpose of the reference
count is to prevent data races with trx_purge_truncate_history().
This is based on
mysql/mysql-server@ac79aa1522.
InnoDB commit fails when consecutive FTS_DOC_ID value
is greater than 4294967295.
Fix is that InnoDB should remove the delta FTS_DOC_ID
value limitations and fts should encode 8 byte value,
remove FTS_DOC_ID_MAX_STEP variable. Replaced the
fts0vlc.ic file with fts0vlc.h
fts_encode_int(): Should be able to encode 10 bytes value
fts_get_encoded_len(): Should get the length of the value
which has 10 bytes
fts_decode_vlc(): Add debug assertion to verify the maximum
length allowed is 10.
mach_read_uint64_little_endian(): Reads 64 bit stored in
little endian format
Added a unit test case which check for minimum and maximum
value to do the fts encoding
In commit 1811fd51fb the assertion
should have said error_reported instead of !error_reported.
But, that revised assertion would still fail in main.defaults
where ER_BAD_DATA is reported during CREATE TABLE.
Assertion `!pk->has_virtual()' failed in dict_index_build_internal_clust
while creating PRIMARY key longer than possible to store in the page.
This happened because the key was wrongly deduced as Long UNIQUE supported,
however PRIMARY KEY cannot be of that type. The main reason is that
only 8 bytes are used to store the hash, see HA_HASH_FIELD_LENGTH.
This is also why HA_NOSAME flag is removed (and caused the assertion in
turn) in open_table_from_share:
if (key_info->algorithm == HA_KEY_ALG_LONG_HASH)
{
key_part_end++;
key_info->flags&= ~HA_NOSAME;
}
To make it unique, the additional check is done by
check_duplicate_long_entries call from ha_write_row, and similar one from
ha_update_row.
PRIMARY key is already forbidden, which is checked by the first test in
main.long_unique, however is_hash_field_needed was wrongly deduced to true
in mysql_prepare_create_table in this particular case.
FIX:
* Improve the check for Key::PRIMARY type
* Simplify is_hash_field_needed deduction for a more neat reading
create_table_info_t::innobase_table_flags(): Refuse to create
a PAGE_COMPRESSED table with PAGE_COMPRESSION_LEVEL=0 if also
innodb_compression_level=0.
The parameter value innodb_compression_level=0 was only somewhat
meaningful for testing or debugging ROW_FORMAT=COMPRESSED tables.
For the page_compressed format, it never made any sense, and the
check in dict_tf_is_valid_not_redundant() that was added in
72378a2583 (MDEV-12873) would cause
the server to crash.
This is a duplicate of MDEV-18278 89936f11e9, but I will add an
additional assertion
Description:
The frm corruption should not be reported during CREATE TABLE. Normally
it doesn't, and the data to fill TABLE is taken by open_table_from_share
call. However, the vcol data is stored as SQL string in
table->s->vcol_defs.str and is anyway parsed on each table open.
It is impossible [or hard] to avoid, because it's hard to clone the
expression tree in general (it's easier to parse).
Normally parse_vcol_defs should only fail on semantic errors. If so,
error_reported is set to true. Any other failure is not expected during
table creation. There is either unhandled/unacknowledged error, or
something went really wrong, like memory reject. This all should be
asserted anyway.
Solution:
* Set *error_reported=true for the forward references check;
* Assert for every unacknowledged error during table creation.
MySQL-5.7 mysql.user tables have a last_password_changed field.
Because before MariaDB-10.4 remained oblivious to this, the act of creating
users or otherwise changing a users row left the last_password_field with 0.
Running a MariaDB-10.4 instance on this would work correctly, until mysql_upgrade
is run, when this 0 value immediately translates to password expired
state.
MySQL-5.7 relied on the password_expired enum to indicate password
expiry so we aren't going to activate password that were expired in
MySQL-5.7.
Thanks Hans Borresen for the bug report and review of the fix.
ha_innobase::delete_table(): When the table that is being dropped
has a name starting with #sql, temporarily set
innodb_lock_wait_timeout=0 while attempting to lock the
persistent statistics tables. If the statistics tables cannot be locked,
pretend that statistics did not exist and carry on with dropping
the table. The SQL layer is not really prepared for failures of
this operation. This is what fixes the test case.
ha_innobase::rename_table(): When renaming a table from a name
that starts with #sql, try to lock the statistics tables with an
immediate timeout, and ignore the statistics if the locks were
not available. In fact, during any rename from a #sql name,
dict_stats_rename_table() should have no effect, because already
when an earlier rename to a #sql name took place we should have
deleted the statistics for the table using the non-#sql name.
This change is just analogous to the ha_innobase::delete_table().
MIPS (and possibly other) platforms require linking against libatomic to
support 64-bit atomic integers. Groonga was failing to do so and all related
tests were failing with an atomics relocation error on MIPS.
Contributors:
James Cowgill <jcowgill@debian.org>
On MIPS platforms (and probably others) unaligned memory access results in a
bus error. In the connect storage engine, block data for some data formats is
stored packed in memory and the TYPBLK class is used to read values from it.
Since TYPBLK does not have special handling for this packed memory, it can
quite easily result in unaligned memory accesses.
The simple way to fix this is to perform all accesses to the main buffer
through memcpy. With GCC and optimizations turned on, this call to memcpy is
completely optimized away on architectures where unaligned accesses are ok
(like x86).
Contributors:
James Cowgill <jcowgill@debian.org>
Some architectures (mips) require libatomic to support proper
atomic operations. Check first if support is available without
linking, otherwise use the library.
Contributors:
James Cowgill <jcowgill@debian.org>
Jessica Clarke <jrtc27@debian.org>
Vicențiu Ciorbaru <vicentiu@mariadb.org>
The server crashes due to passing NULL to spider_free().
In some cases, this == pt_handler_share_handlers[0] at the label
error_get_share in ha_spider::open().
In such cases, to nullify pt_handler_share_handlers[0]->wide_handler
is nothing but to nullify this->wide_handler. We should not do this
before freeing this->wide_handler.
As part of MDEV-26779 we first discovered the effect of enabling spinning for
some critical mutex. MDEV-26779 tried enabling it for lock_sys.wait_mutex and
observed a good gain in performance.
In yet another discussion, Mark Callaghan pointed a reference to pthread based
mutex spin using PTHREAD_MUTEX_ADAPTIVE_NP (MDEV-26769 Intel RTM).
Given the strong references, Marko Makela as part of his comment in #1923
pointed an idea to enable spinning for other mutexes. Based on perf profiling
we decided to explore spinning for log_sys_mutex and log_flush_order_mutex as
they are occupying the top slots in the contented mutex list.
The evaluation showed promising results for ARM64 but not for x86.
So a patch is here-by proposed to enable the spinning of the mutex for
ARM64-based platform.
Do not print illegal table field names for non-top-level SELECT list,
they will not be refered in any case but create problem for parsing
of printed result.
Problem:
========
This patch addresses two issues.
First, if a CHANGE MASTER command is issued and an error happens
while locating the replica’s relay logs, the logs can be put into an
invalid state where future updates fail and future CHANGE MASTER
calls crash the server. More specifically, right before a replica
purges the relay logs (part of the `CHANGE MASTER TO` logic), the
relay log is temporarily closed with state LOG_TO_BE_OPENED. If the
server errors in-between the temporary log closure and purge, i.e.
during the function find_log_pos, the log should be closed.
MDEV-25284 reveals the log is not properly closed.
Second, upon issuing a RESET SLAVE ALL command, a slave’s GTID
filters are not cleared (DO_DOMAIN_IDS, IGNORE_DOMIAN_IDS,
IGNORE_SERVER_IDS). MySQL had a similar bug report, Bug #18816897,
which fixed this issue to clear IGNORE_SERVER_IDS after issuing
RESET SLAVE ALL in version 5.7.
Solution:
=========
To fix the first problem, the CHANGE MASTER error handling logic was
extended to transition the relay log state to LOG_CLOSED from
LOG_TO_BE_OPENED.
To fix the second problem, the RESET SLAVE ALL logic is extended to
clear the domain_id filter and ignore_server_ids.
Reviewed By:
============
Andrei Elkin <andrei.elkin@mariadb.com>
The SQL layer never acquires metadata locks (MDL) on the tables
that the tables that DML statement accesses is modifying.
However, the storage engine must access the parent table in order to
ensure that the child table will not refer to a non-existing record
in the parent table.
During certain DDL operations, the InnoDB table metadata (dict_table_t)
may be be freed and rebuilt. This would cause a race condition with
a concurrent INSERT that is attempting to report a FOREIGN KEY violation.
We work around the insufficient MDL during DML by acquiring exclusive
InnoDB table locks on all child tables during DDL. To avoid deadlocks,
we will follow the following order of acquisition:
1. tables whose REFERENCES clauses point to the current table
2. the current table that is being subjected to DDL
3. mysql.innodb_table_stats
4. mysql.innodb_index_stats
5. the InnoDB dictionary tables (SYS_TABLES and so on)
6. exclusive dict_sys.latch
Spider accesses a freed connection in ha_spider::end_bulk_insert()
and results in SIGSEGV.
The cause of the bug is that ha_spider::is_bulk_insert_exec_period()
wrongly returns TRUE when the bulk insertion has not yet started.
Spider decides whether it is during the bulk insertion or not by
the value of insert_pos, but the variable is not reset in a case,
and this result in the bug.
The purpose of non-exclusive locks in a transaction is to guarantee
that the records covered by those locks must remain in that way until
the transaction is committed. (The purpose of gap locks is to ensure
that a record that was nonexistent will remain that way.)
Once a transaction has reached the XA PREPARE state, the only allowed
further actions are XA ROLLBACK or XA COMMIT. Therefore, it can be
argued that only the exclusive locks that the XA PREPARE transaction
is holding are essential.
Furthermore, InnoDB never preserved explicit locks across server restart.
For XA PREPARE transations, we will only recover implicit exclusive locks
for records that had been modified.
Because of the fact that XA PREPARE followed by a server restart will
cause some locks to be lost, we might as well always release all
non-exclusive locks during the execution of an XA PREPARE statement.
lock_release_on_prepare(): Release non-exclusive locks on XA PREPARE.
trx_prepare(): Invoke lock_release_on_prepare() unless the
isolation level is SERIALIZABLE or this is an internal distributed
transaction with the binlog (not actual XA PREPARE statement).
This has been discussed with Sergei Golubchik and Andrei Elkin.
Reviewed by: Sergei Golubchik
The server crashes if ALTER TABLE, which accesses physical data
placed at data nodes, is performed on a Spider table.
The cause of the bug is that spider_check_trx_and_get_conn() does
not allocate connections if sql_command == SQLCOM_ALTER_TABLE.
Some ALTER TABLE statements, like ALTER TABLE ... CHECK PARTITION,
access data nodes. So, we need to allocate a new connection before
performing ALTER TABLEs.
Schema and table names in a veiw FRM files are:
- in upper case on Linux
- in lower case on Windows
Using the LOWER() function when displaying an FRM file fragment,
to avoid the OS-specific difference.
This happens upon CREATE USER and DROP ROLE.
The underlying problem is that our HASH implementation shuffles elements
around when performing an update or delete. This means that when doing a
scan through the HASH table by index, in search of elements to delete or
update one must restart the scan to make sure nothing is missed if at least
one delete / update happened.
More specifically, what happened in this case:
The hash has 131 element, DROP ROLE removes the element
[119]. Its [119]->next was element [129], so [129] is moved to [119].
Now we need to compact the hash, removing the last element [130]. It
gets one bit off its hash value and becomes element [2]. The existing
element [2] is moved to [129], and old [130] is moved to [2].
We cannot simply move [130] to [129] and make [2]->next=130, it won't
work if [2] is itself in the collision list and doesn't belong in [2].
The handle_grant_struct code assumed that it is safe to continue by only
reexamining the currently modified / deleted element index, but that is
not true.
Missing to delete an element in the hash triggered the assertion in
the test case. DROP ROLE would not clear all necessary role->role or
role->user mappings.
To fix the problem we ensure that the scan is restarted, only if an
element was deleted / updated, similar to how bubble-sort keeps sorting
until it finds no more elements to swap.
There were two independent problems which lead to the crash
and to the non-relevant records returned in I_S queries:
- The code in the I_S implementation was not secure
about values with 0x00 bytes.
It's fixed by using check_db_name() and check_table_name()
inside make_table_name_list(), and by adding the test for
0x00 inside check_table_name().
- The code in Item_string::print() did not convert
strings without introducers when restoring
the CREATE VIEW statement from an Item tree.
This made wrong literals inside the "query" line in the view FRM file
in cases when the VIEW parse time
character_set_client!=character_set_connection.
That's fixed by adding a proper conversion.
This change also fixed a similar problem in SHOW PROCEDURE CODE -
the literals were displayed in wrong character set in SP instructions
in cases when the SP parse time
character_set_client!=character_set_connection.