- Some of the bug fixes are backports from 10.5!
- The fix in innobase/fil/fil0fil.cc is just a backport to get less
error messages in mysqld.1.err when running with valgrind.
- Renamed HAVE_valgrind_or_MSAN to HAVE_valgrind
When using field_conv(), which is called in case of field1=field2 copy in
fill_records(), full varstring's was copied, including unitialized bytes.
This caused valgrind to compilain about usage of unitialized bytes when
using Aria static length records.
Fixed by not using memcpy when copying varstrings but instead just copy
the real bytes.
- Removed not needed bzero in void TABLE::initialize_quick_structures().
- Replaced bzero with TRASH_ALLOC() to have this change verfied with
memory checkers
- Added missing table->quick_keys.is_set in table_cond_selectivity()
- service not using "--defaults-file" can have any name not just "MySQL"
- service with "--defaults-file", without datadir in them
use default datadir (install_root\data)
MemorySanitizer (clang -fsanitize=memory) requires that all code
be compiled with instrumentation enabled. The only exception is the
C runtime library. Failure to use instrumented libraries will cause
bogus messages about memory being uninitialized.
In WITH_MSAN builds, we must avoid calling getservbyname(),
because even though it is a standard library function, it is
not instrumented, not even in clang 10.
Note: Before MariaDB Server 10.5, ./mtr will typically fail
due to the old PCRE library, which was updated in MDEV-14024.
The following cmake options were tested on 10.5
in commit 94d0bb4dbe:
cmake \
-DCMAKE_C_FLAGS='-march=native -O2' \
-DCMAKE_CXX_FLAGS='-stdlib=libc++ -march=native -O2' \
-DWITH_EMBEDDED_SERVER=OFF -DWITH_UNIT_TESTS=OFF -DCMAKE_BUILD_TYPE=Debug \
-DWITH_INNODB_{BZIP2,LZ4,LZMA,LZO,SNAPPY}=OFF \
-DPLUGIN_{ARCHIVE,TOKUDB,MROONGA,OQGRAPH,ROCKSDB,CONNECT,SPIDER}=NO \
-DWITH_SAFEMALLOC=OFF \
-DWITH_{ZLIB,SSL,PCRE}=bundled \
-DHAVE_LIBAIO_H=0 \
-DWITH_MSAN=ON
MEM_MAKE_DEFINED(): An alias for VALGRIND_MAKE_MEM_DEFINED()
and __msan_unpoison().
MEM_GET_VBITS(), MEM_SET_VBITS(): Aliases for
VALGRIND_GET_VBITS(), VALGRIND_SET_VBITS(), __msan_copy_shadow().
InnoDB: Replace the UNIV_MEM_ macros with corresponding MEM_ macros.
ut_crc32_8_hw(), ut_crc32_64_low_hw(): Use the compiler built-in
functions instead of inline assembler when building WITH_MSAN.
This will require at least -msse4.2 when building for IA-32 or AMD64.
The inline assembler would not be instrumented, and would thus cause
bogus failures.
mariabackup tries to allocate a buffer of page_size*page_size/4 size.
for 64k page it means 1Gb, which doesn't work very well on 32-bit builders.
Skip the 64k page test on 32bit.
* fix FindLZ4 to follow convention (LIBRARIES, not LIBRARY)
* remove redundant checks from rocksdb/CMakeLists.txt
* put all checks through the same macro that uniformly
checks for a package, prints the message, adds definition
The library finder needs to have capitals in its name so that FIND_PACKAGE
will load the correct finder and actually detect that libzstd is available.
Without this change the CMake would just always silently skip ZSTD since
it would never find it.
Simplify Debian autopkgtest RocksDB part and make it more verbose so that
future regressions like this are easier to debug.
Also remove QUIET from the RocksDB FIND_PACKAGE call so that it is easier
to read in build logs what libraries were detected. Also add missing
underscores to error messages.
The issue here is for a DEPENDENT subquery that has an aggregate function in the ORDER BY clause,
is wrapped inside an Item_aggregate_ref. For computation of ORDER BY we need to refer to the
temp table field corresponding to this item. But in the function make_sortorder, we were
explicitly casting Item_aggrgate_ref to Item_sum, which leads to us not getting the temp
table field corresponding to the item.
This function is very common in a debug build. I can even see it in
profiler.
This patch reduces execution time of fil_validate() from
8948ns
8367ns
8650ns
8906ns
8448ns
to
260ns
232ns
403ns
275ns
169ns
in my environment.
The trick is a faster fil_space_t iteration. Hash table
is typically initialized with a size of 50,000. And looping through
it is slow. Slower, than iterating an exact amount of fil_space_t
which is typically less than ten.
Only debug builds are affected.
The issue here was that the left expr and right expr of the ANY subquery
had different character sets, so we were converting the left expr to utf8 character set.
So when this conversion was happening we were actually converting the item inside the cache,
it looked like <cache>(convert(t1.l1 using utf8)), which is incorrect.
To fix this problem we are going to store the reference of the left expr and convert that
to utf8 character set, it would look like convert(<cache>(`test`.`t1`.`l1`) using utf8)
Fix the following valgrind error.
==94390== Thread 29:
==94390== Invalid read of size 8
==94390== at 0x78389D: thd_increment_bytes_sent (sql_class.cc:4265)
==94390== by 0xC8EC46: net_real_write (net_serv.cc:730)
==94390== by 0xC8E0C8: net_flush (net_serv.cc:383)
==94390== by 0xC8E4D0: net_write_command (net_serv.cc:521)
==94390== by 0xADCE61: cli_advanced_command (client.c:468)
==94390== by 0xAE3CAF: mysql_close_slow_part (client.c:3671)
==94390== by 0xAE3D28: mysql_close (client.c:3683)
==94390== by 0x149E69A8: spider_db_mbase::disconnect() (spd_db_mysql.cc:2217)
==94390== by 0x1491EA26: spider_db_disconnect(st_spider_conn*) (spd_db_conn.cc:297)
==94390== by 0x14948EBE: spider_free_conn_alloc(st_spider_conn*) (spd_conn.cc:196)
==94390== by 0x1494B26A: spider_free_conn(st_spider_conn*) (spd_conn.cc:1251)
==94390== by 0x1494941F: spider_free_conn_from_trx(st_spider_transaction*, st_spider_conn*, bool, bool, int*) (spd_conn.cc:315)
==94390== Address 0x1f0e0990 is 4,832 bytes inside a block of size 25,728 free'd
==94390== at 0x4C2ACBD: free (vg_replace_malloc.c:530)
==94390== by 0x13F5545: my_free (my_malloc.c:222)
==94390== by 0x6C75B7: ilink::operator delete(void*, unsigned long) (sql_list.h:618)
==94390== by 0x77B9F6: THD::~THD() (sql_class.cc:1724)
==94390== by 0x1494FCE0: spider_bg_conn_action(void*) (spd_conn.cc:2580)
==94390== by 0x4E3DDD4: start_thread (in /usr/lib64/libpthread-2.17.so)
==94390== by 0x5FBFEAC: clone (in /usr/lib64/libc-2.17.so)
==94390== Block was alloc'd at
==94390== at 0x4C29BC3: malloc (vg_replace_malloc.c:299)
==94390== by 0x13F4DFA: my_malloc (my_malloc.c:101)
==94390== by 0x1491CF06: ilink::operator new(unsigned long) (sql_list.h:614)
==94390== by 0x1494F7FD: spider_bg_conn_action(void*) (spd_conn.cc:2501)
==94390== by 0x4E3DDD4: start_thread (in /usr/lib64/libpthread-2.17.so)
==94390== by 0x5FBFEAC: clone (in /usr/lib64/libc-2.17.so)
==94390== Invalid write of size 8
==94390== at 0x7838AF: thd_increment_bytes_sent (sql_class.cc:4265)
==94390== by 0xC8EC46: net_real_write (net_serv.cc:730)
==94390== by 0xC8E0C8: net_flush (net_serv.cc:383)
==94390== by 0xC8E4D0: net_write_command (net_serv.cc:521)
==94390== by 0xADCE61: cli_advanced_command (client.c:468)
==94390== by 0xAE3CAF: mysql_close_slow_part (client.c:3671)
==94390== by 0xAE3D28: mysql_close (client.c:3683)
==94390== by 0x149E69A8: spider_db_mbase::disconnect() (spd_db_mysql.cc:2217)
==94390== by 0x1491EA26: spider_db_disconnect(st_spider_conn*) (spd_db_conn.cc:297)
==94390== by 0x14948EBE: spider_free_conn_alloc(st_spider_conn*) (spd_conn.cc:196)
==94390== by 0x1494B26A: spider_free_conn(st_spider_conn*) (spd_conn.cc:1251)
==94390== by 0x1494941F: spider_free_conn_from_trx(st_spider_transaction*, st_spider_conn*, bool, bool, int*) (spd_conn.cc:315)
==94390== Address 0x1f0e0990 is 4,832 bytes inside a block of size 25,728 free'd
==94390== at 0x4C2ACBD: free (vg_replace_malloc.c:530)
==94390== by 0x13F5545: my_free (my_malloc.c:222)
==94390== by 0x6C75B7: ilink::operator delete(void*, unsigned long) (sql_list.h:618)
==94390== by 0x77B9F6: THD::~THD() (sql_class.cc:1724)
==94390== by 0x1494FCE0: spider_bg_conn_action(void*) (spd_conn.cc:2580)
==94390== by 0x4E3DDD4: start_thread (in /usr/lib64/libpthread-2.17.so)
==94390== by 0x5FBFEAC: clone (in /usr/lib64/libc-2.17.so)
==94390== Block was alloc'd at
==94390== at 0x4C29BC3: malloc (vg_replace_malloc.c:299)
==94390== by 0x13F4DFA: my_malloc (my_malloc.c:101)
==94390== by 0x1491CF06: ilink::operator new(unsigned long) (sql_list.h:614)
==94390== by 0x1494F7FD: spider_bg_conn_action(void*) (spd_conn.cc:2501)
==94390== by 0x4E3DDD4: start_thread (in /usr/lib64/libpthread-2.17.so)
==94390== by 0x5FBFEAC: clone (in /usr/lib64/libc-2.17.so)
This issue is pretty much the same as MDEV-20213.
The fix is similar to:
3c238ac51c52c4abbff2
Check::validate(): fix a debug assertion
SysTablespace::open_or_create(): protect assigning to a shared
variable with a mutex
follow up
fil_system.sys_space is a shared variable between the thread
which assigns a value to it, and the thread which does Check::validate()
SysTablespace::open_or_create(): protect a shared variable with
a mutex to avoid any data race surprises.
Check::validate(): Relax a debug assertion. TRX_SYS_SPACE fil_space_t
can be created and became visible to this assertion before
fil_system.sys_space becomes initialized
Problem:
========
Relay_log_info::flush reports following MSAN issue.
==17820==WARNING: MemorySanitizer: use-of-uninitialized-value is reported
#5 0x00005584f0981441 in my_write (Filedes=22,
Buffer=0x72500003e818 "5\n./slave-relay-bin.000003\n21385\n
master-bin.000001\n21643\n0\n", '\245' <repeats 141 times>..., Count=118,
MyFlags=532) at /home/sujatha/bug_repo/test-10.5-msan/mysys/my_write.c:49
Analysis:
=========
In parallel replication at the end of each statement execution the worker execution
status is updated in 'relay-log.info' file. When two workers try to flush
the status at the same time, since the write to cache is not serialized both
workers write to the same address simultaneously and increment the
length twice. Because of this the length of buffer is more than actual data.
When flush code tries to read the buffer beyond valid data length MSAN
reports uninitialized values error.
Fix:
===
Serialize the relay log flush operation using "rli->data_lock".
Deadlock in DbugParse, on Linux.
In 10.1, DBUG recursive mutex was improperly implemented.
CODE_STATE::locked counter was never updated.
Copy the code around LockMutex/UnlockMutex from 10.2
Analysis:
========
When "Profiling" is enabled, server collects the resource usage of each
statement that gets executed in current session. Profiling doesn't support
nested statements. In order to ensure this behavior when profiling is enabled
for a statement, there should not be any other active query which is being
profiled. This active query information is stored in 'current' variable. When
a nested query arrives it finds 'current' being not NULL and server aborts.
When 'init_connect' and 'init_slave' system variables are set they contain a
set of statements to be executed. "execute_init_command" is the function call
which invokes "dispatch_command" for each statement provided in
'init_connect', 'init_slave' system variables. "execute_init_command" invokes
"start_new_query" and it passes the statement list to "dispatch_command". This
"dispatch_command" intern invokes "start_new_query" which leads to nesting of
queries. Hence '!current' assert is triggered.
Fix:
===
Remove profiling from "execute_init_command" as it will be done within
"dispatch_command" execution.
Starting from 10.3, the optimizer is able to detect that entire outer join
nests are constants (because of "Impossible ON") and remove them (see
mark_join_nest_as_const)
However, this was not properly accounted for in NESTED_JOIN structure
and the way check_interleaving_with_nj() uses its n_tables member to
check if the join prefix order is allowed.
(The result was that the optimizer could conclude that no join prefix is
allowed and fail an assertion)
The galera.galera_parallel_autoinc_manytrx mtr test opens and runs test
scenario through 3 connections to node 1 and one connection to node 2.
In the test initialization phase, the test creates two tables 't1' and 'ten'
and then creates a stored procedure 'p1' to operate on these tables.
These 3 create DDL statements are issued through same connection to node 1.
In the next test phase, the mtr script uses send command to launch the call
for the p1 stored procedure through all 3 connections to node 1 and through
one connection to node 2. As the mtr send command is asynchronous,
this test phase is non blocking and fast operation.
Now, if the replication between nodes is slow, it may happen that the
initialization phase DDL statements have not been received or have not been
fully applied in node 2. Therefore there is no guarantee that the test tables
and the stored procedure have been created in node 2. Yet, the test is trying
to call p1 in node 2.
In the failure case error logs, there is error message
"MTR failed: query 'reap' failed: 1305: PROCEDURE test.p1 does not exist"
The reap command through connection to node 2, is the first place where test
execution may observe that test tables and/or stored procedure are not yet
created in node 2.
The fix in this commit adds a wait condition in connection to node 2, to wait
until the stored procedure is created before calling the stored procedure.
The wait is implemented by looking in information_schema.routines for the p1
stored procedure.
MDEV-22140 galera.galera_drop_database MTR failed: InnoDB: MySQL is trying to drop database `fts`.`` though there are still open handles
Add wait conditions to wait that all operations are done in both
nodes.
On FreeBSD, perl isn't in /usr/bin, its in /usr/local/bin or
elsewhere in the path.
Like storage/{maria/unittest/,}ma_test_* , we use /usr/bin/env to
find perl and run it.
Item_func_json_extract did not implement val_decimal(),
so CAST(JSON_EXTRACT('{"x":true}', '$.x') AS DECIMAL) erroneously
returned 0 with a warning because of convertion from the string "true"
to decimal.
Implementing val_decimal(), so boolean values are correctly handled.
This feature adds the support for rr in mtr. These 2 options are added
--rr run the mysqld in rr record mode
--rr_option= run the rr with custom record option, for multiple
options use --rr_option= for each option.
For example
./mtr main.view --rr_option=-h --rr_option=-u --rr_option=-c=23
--boot-rr run the mysqld performing bootstrap in rr record mode
Recording are stored in mysql-test/var/rr folder.
To run recording please run
rr replay var/rr/mysql-X
Limitations
Restart will create a new recording.
Repeat will work on same recording , So might be harder to debug.
If test create the multiple instance of mariadb all will be stored in var/rr