PFS_atomic class contains wrappers around my_atomic_* operations, which
are macros to GNU atomic operations (__atomic_*). Due to different
implementations of compilers, clang may encounter errors when compiling
on x86_32 architecture.
The following functions are replaced with C++ std::atomic type in
performance schema code base:
- PFS_atomic::store_*()
-> my_atomic_store*
-> __atomic_store_n()
=> std::atomic<T>::store()
- PFS_atomic::load_*()
-> my_atomic_load*
-> __atomic_load_n()
=> std::atomic<T>::load()
- PFS_atomic::add_*()
-> my_atomic_add*
-> __atomic_fetch_add()
=> std::atomic<T>::fetch_add()
- PFS_atomic::cas_*()
-> my_atomic_cas*
-> __atomic_compare_exchange_n()
=> std::atomic<T>::compare_exchange_strong()
and PFS_atomic class could be dropped completely.
Note that in the wrapper memory order passed to original GNU atomic
extensions are hard-coded as `__ATOMIC_SEQ_CST`, which is equivalent to
`std::memory_order_seq_cst` in C++, and is the default parameter for
std::atomic_* functions.
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services.
Apart from better performance when accessing thread local variables,
we'll get rid of things that depend on initialization/cleanup of
pthread_key_t variables.
Where appropriate, use compiler-dependent pre-C++11 thread-local
equivalents, where it makes sense, to avoid initialization check overhead
that non-static thread_local can suffer from.
A few different incorrect function type UBSAN issues have been
grouped into this patch.
The only real potentially undefined behavior is an error about
show_func_mutex_instances_lost, which when invoked in
sql_show.cc::show_status_array(), puts 5 arguments onto the stack;
however, the implementing function only actually has 3 parameters (so
only 3 would be popped). This was fixed by adding in the remaining
parameters to satisfy the type mysql_show_var_func.
The rest of the findings are pointer type mismatches that wouldn't
lead to actual undefined behavior. The lf_hash_initializer function
type definition is
typedef void (*lf_hash_initializer)(LF_HASH *hash, void *dst, const void *src);
but the MDL_lock and table cache's implementations of this function
do not have that signature. The MDL_lock has specific MDL object
parameters:
static void lf_hash_initializer(LF_HASH *hash __attribute__((unused)),
MDL_lock *lock, MDL_key *key_arg)
and the table cache has specific TDC parameters:
static void tdc_hash_initializer(LF_HASH *,
TDC_element *element, LEX_STRING *key)
leading to UBSAN runtime errors when invoking these functions.
This patch fixes these type mis-matches by changing the
implementing functions to use void * and const void * for their
respective parameters, and later casting them to their expected
type in the function body.
Note too the functions tdc_hash_key and tc_purge_callback had
a similar problem to tdc_hash_initializer and was fixed
similarly.
Reviewed By:
============
Sergei Golubchik <serg@mariadb.com>
Added SHOW_LONGLONG_NOFLUSH to mark the variables that should not be
flushed.
New variables cleared as part of SHOW STATUS:
- Rpl_semi_sync_master_request_ack
- Rpl_semi_sync_master_get_ac
- Rpl_semi_sync_slave_send_ack
- Slave_skipped_error
I checked all stack overflow potential problems found with
gcc -Wstack-usage=16384
and
clang -Wframe-larger-than=16384 -no-inline
Fixes:
Added '#pragma clang diagnostic ignored "-Wframe-larger-than="'
to a lot of function to where stack usage large but resonable.
- Added stack check warnings to BUILD scrips when using clang and debug.
Function changed to use malloc instead allocating things on stack:
- read_bootstrap_query() now allocates line_buffer (20000 bytes) with
malloc() instead of using stack. This has a small performance impact
but this is not releant for bootstrap.
- mroonga grn_select() used 65856 bytes on stack. Changed it to use
malloc().
- Wsrep_schema::replay_transaction() and
Wsrep_schema::recover_sr_transactions().
- Connect zipOpen3()
Not fixed:
- mroonga/vendor/groonga/lib/expr.c grn_proc_call() uses
43712 byte on stack. However this is not easy to fix as the stack
used is caused by a lot of code generated by defines.
- Most changes in mroonga/groonga where only adding of pragmas to disable
stack warnings.
- rocksdb/options/options_helper.cc uses 20288 of stack space.
(no reason to fix except to get rid of the compiler warning)
- Causes using alloca() where the allocation size is resonable.
- An issue in libmariadb (reported to connectors).
This patch also fixes:
MDEV-33050 Build-in schemas like oracle_schema are accent insensitive
MDEV-33084 LASTVAL(t1) and LASTVAL(T1) do not work well with lower-case-table-names=0
MDEV-33085 Tables T1 and t1 do not work well with ENGINE=CSV and lower-case-table-names=0
MDEV-33086 SHOW OPEN TABLES IN DB1 -- is case insensitive with lower-case-table-names=0
MDEV-33088 Cannot create triggers in the database `MYSQL`
MDEV-33103 LOCK TABLE t1 AS t2 -- alias is not case sensitive with lower-case-table-names=0
MDEV-33109 DROP DATABASE MYSQL -- does not drop SP with lower-case-table-names=0
MDEV-33110 HANDLER commands are case insensitive with lower-case-table-names=0
MDEV-33119 User is case insensitive in INFORMATION_SCHEMA.VIEWS
MDEV-33120 System log table names are case insensitive with lower-cast-table-names=0
- Removing the virtual function strnncoll() from MY_COLLATION_HANDLER
- Adding a wrapper function CHARSET_INFO::streq(), to compare
two strings for equality. For now it calls strnncoll() internally.
In the future it will turn into a virtual function.
- Adding new accent sensitive case insensitive collations:
- utf8mb4_general1400_as_ci
- utf8mb3_general1400_as_ci
They implement accent sensitive case insensitive comparison.
The weight of a character is equal to the code point of its
upper case variant. These collations use Unicode-14.0.0 casefolding data.
The result of
my_charset_utf8mb3_general1400_as_ci.strcoll()
is very close to the former
my_charset_utf8mb3_general_ci.strcasecmp()
There is only a difference in a couple dozen rare characters, because:
- the switch from "tolower" to "toupper" comparison, to make
utf8mb3_general1400_as_ci closer to utf8mb3_general_ci
- the switch from Unicode-3.0.0 to Unicode-14.0.0
This difference should be tolarable. See the list of affected
characters in the MDEV description.
Note, utf8mb4_general1400_as_ci correctly handles non-BMP characters!
Unlike utf8mb4_general_ci, it does not treat all BMP characters
as equal.
- Adding classes representing names of the file based database objects:
Lex_ident_db
Lex_ident_table
Lex_ident_trigger
Their comparison collation depends on the underlying
file system case sensitivity and on --lower-case-table-names
and can be either my_charset_bin or my_charset_utf8mb3_general1400_as_ci.
- Adding classes representing names of other database objects,
whose names have case insensitive comparison style,
using my_charset_utf8mb3_general1400_as_ci:
Lex_ident_column
Lex_ident_sys_var
Lex_ident_user_var
Lex_ident_sp_var
Lex_ident_ps
Lex_ident_i_s_table
Lex_ident_window
Lex_ident_func
Lex_ident_partition
Lex_ident_with_element
Lex_ident_rpl_filter
Lex_ident_master_info
Lex_ident_host
Lex_ident_locale
Lex_ident_plugin
Lex_ident_engine
Lex_ident_server
Lex_ident_savepoint
Lex_ident_charset
engine_option_value::Name
- All the mentioned Lex_ident_xxx classes implement a method streq():
if (ident1.streq(ident2))
do_equal();
This method works as a wrapper for CHARSET_INFO::streq().
- Changing a lot of "LEX_CSTRING name" to "Lex_ident_xxx name"
in class members and in function/method parameters.
- Replacing all calls like
system_charset_info->coll->strcasecmp(ident1, ident2)
to
ident1.streq(ident2)
- Taking advantage of the c++11 user defined literal operator
for LEX_CSTRING (see m_strings.h) and Lex_ident_xxx (see lex_ident.h)
data types. Use example:
const Lex_ident_column primary_key_name= "PRIMARY"_Lex_ident_column;
is now a shorter version of:
const Lex_ident_column primary_key_name=
Lex_ident_column({STRING_WITH_LEN("PRIMARY")});
to iterate over all status variables one should use
LOCK_all_status_vars not LOCK_status
this fixes sporadic mutex lock inversion in plugins.password_reuse_check:
* acl_cache->lock is taken over complex operations that might increment
status counters (under LOCK_status).
* acl_cache->lock is needed to get the values of Acl% status variables
when iterating over status variables
Under terms of MDEV 27490 we'll add support for non-BMP identifiers
and upgrade casefolding information to Unicode version 14.0.0.
In Unicode-14.0.0 conversion to lower and upper cases can increase octet length
of the string, so conversion won't be possible in-place any more.
This patch removes virtual functions performing in-place casefolding:
- my_charset_handler_st::casedn_str()
- my_charset_handler_st::caseup_str()
and fixes the code to use the non-inplace functions instead:
- my_charset_handler_st::casedn()
- my_charset_handler_st::caseup()
This column (`COUNT_TRANSACTIONS_RETRIES`) is defined as `BIGINT UNSIGNED`
(64-bit unsigned integer) in the user-visible SQL definition:
182ff21ace/storage/perfschema/table_replication_applier_status.cc (L66)
"COUNT_TRANSACTIONS_RETRIES BIGINT unsigned not null comment 'The number of retries that were made because the replication SQL thread failed to apply a transaction.',"
And its value is internally set/updated using the `set_field_ulonglong`
function:
182ff21ace/storage/perfschema/table_replication_applier_status.cc (L231-L233)
case 3: /* total number of times transactions were retried */
set_field_ulonglong(f, m_row.count_transactions_retries);
break;
… but the structure where it is stored allocates only `ulong` for it:
182ff21ace/storage/perfschema/table_replication_applier_status.h (L62)
ulong count_transactions_retries;
As a result of this inconsistency:
1. On any platform where `ulong` is `uint32_t` and `ulonglong` is `uint64_t`,
setting this value would corrupt the 4 bytes of memory *following* the 4
bytes actually allocated for it.
Likely this problem was never noticed because this is the final element in
the structure, and the structure is padded by the compiler to prevent
memory corruption errors.
2. On any BIG-ENDIAN platform where `ulong` is `uint32_t` and `ulonglong`
is `uint64_t`, reading back the value of this column will result in
total garbage.
Likely this problem was never noticed because MariaDB has not been
tested on 32-bit big-endian platforms.
In order not to affect the user-visible/SQL definition of this column, the
correct way to fix this issue is to change it to `ulonglong` in the
structure definition. See
https://github.com/MariaDB/server/pull/2763/files#r1329110832 for the
original identification and discussion of this issue.
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the BSD-new
license. I am contributing on behalf of my employer Amazon Web Services