MemorySanitizer (clang -fsanitize=memory) requires that all code
be compiled with instrumentation enabled. The C runtime library
is an exception. Failure to use instrumented libraries will cause
bogus messages about memory being uninitialized.
In WITH_MSAN builds, we must avoid calling getservbyname(),
because even though it is a standard library function, it is
not instrumented, not even in clang 10.
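A sketch of the failure mode (illustrative only; lookup_mysql_port() is a
hypothetical helper, and the exact report depends on the libc and clang
versions):

#include <netdb.h>
#include <arpa/inet.h>
#include <stdint.h>

/* getservbyname() has no MSAN interceptor: the struct it returns may
   have been allocated via intercepted malloc() (poisoned shadow) but
   is filled by uninstrumented libc code, so MSAN can miss the
   initialization and flag the first read of the result. */
static int lookup_mysql_port(void)
{
  struct servent *se= getservbyname("mysql", "tcp");
  if (se != NULL && se->s_port != 0)   /* bogus MSAN report here */
    return (int) ntohs((uint16_t) se->s_port);
  return 3306;                         /* fall back to the default port */
}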
The following cmake options were tested:
-DCMAKE_C_FLAGS='-march=native -O2'
-DCMAKE_CXX_FLAGS='-stdlib=libc++ -march=native -O2'
-DWITH_EMBEDDED_SERVER=OFF -DWITH_UNIT_TESTS=OFF -DCMAKE_BUILD_TYPE=Debug
-DWITH_INNODB_{BZIP2,LZ4,LZMA,LZO,SNAPPY}=OFF
-DPLUGIN_{ARCHIVE,TOKUDB,MROONGA,OQGRAPH,ROCKSDB,CONNECT,SPIDER}=NO
-DWITH_SAFEMALLOC=OFF
-DWITH_{ZLIB,SSL,PCRE}=bundled
-DHAVE_LIBAIO_H=0
-DWITH_MSAN=ON
MEM_MAKE_DEFINED(): An alias for VALGRIND_MAKE_MEM_DEFINED()
and in the future, __msan_unpoison().
For now, neither MEM_MAKE_DEFINED() nor MEM_UNDEFINED()
performs any action under MSAN. Enabling them would catch more bugs,
but would also require some more fixes or work-arounds.
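A minimal sketch of the mapping described above (the real macros live in
the MariaDB headers and the guards may differ; HAVE_valgrind is assumed
here):

#ifndef __has_feature
# define __has_feature(x) 0  /* for compilers without __has_feature */
#endif

#if defined(HAVE_valgrind)
# include <valgrind/memcheck.h>
# define MEM_MAKE_DEFINED(a,len) VALGRIND_MAKE_MEM_DEFINED(a,len)
#elif __has_feature(memory_sanitizer)
# include <sanitizer/msan_interface.h>
/* Intended future mapping, currently disabled as explained above:
# define MEM_MAKE_DEFINED(a,len) __msan_unpoison(a,len) */
# define MEM_MAKE_DEFINED(a,len) ((void) 0)
#else
# define MEM_MAKE_DEFINED(a,len) ((void) 0)
#endif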
Json_writer::add_double(): Work around a frequently occurring
failure in optimizer tests, related to EXPLAIN FORMAT=JSON.
dtoa(): Disable MSAN instrumentation altogether. For some reason, this
function triggers a lot of trouble, especially when invoked for
DBUG functions. For example, the MDL default timeout is dd=86400 seconds,
yet for some reason it is claimed to be uninitialized.
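One way to express this with clang is a per-function no_sanitize
attribute; a sketch (the attribute spelling used in the actual source may
differ, and dtoa_sketch() is an illustrative stand-in, not the real dtoa()
signature):

#include <stdlib.h>

#if defined(__clang__)
# define ATTRIBUTE_NO_MSAN __attribute__((no_sanitize("memory")))
#else
# define ATTRIBUTE_NO_MSAN
#endif

/* The function body is compiled without MSAN checks. */
ATTRIBUTE_NO_MSAN static double dtoa_sketch(const char *s)
{
  return strtod(s, NULL);
}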
InnoDB: Define UNIV_DEBUG_VALGRIND also in WITH_MSAN builds.
ut_crc32_8_hw(), ut_crc32_64_low_hw(): Use the compiler built-in
functions instead of inline assembler when building WITH_MSAN.
This will require at least -msse4.2 when building for IA-32 or AMD64.
The inline assembler would not be instrumented, and would thus cause
bogus failures.
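A sketch of the replacement using the SSE4.2 intrinsics from
<nmmintrin.h> (function names here are illustrative; the real ut_crc32
code processes whole buffers):

#include <nmmintrin.h>  /* requires -msse4.2 */
#include <stdint.h>

/* Unlike inline assembler, intrinsics are visible to MemorySanitizer,
   so the data flow through the CRC computation is tracked correctly. */
static inline uint32_t ut_crc32_8_sketch(uint32_t crc, uint8_t data)
{
  return _mm_crc32_u8(crc, data);
}

static inline uint32_t ut_crc32_64_sketch(uint32_t crc, uint64_t data)
{
  return (uint32_t) _mm_crc32_u64(crc, data);
}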
Valgrind only seems to complain about memcmp() operations that
actually end up reading uninitialized data, while MemorySanitizer
requires that the entire length of both buffers be defined.
During the test main.query_cache_innodb, only 16 bytes of
db_buf are initialized during the memcmp() in
dict_acquire_mdl_shared<false>(), but db_len was wrongly set to 20 bytes.
Something similar was fixed in MDEV-21344, but only for the table name,
in commit 0e25a8b4a6.
dict_table_t::parse_name(): Assign the return value of
filename_to_tablename() to the output parameters for lengths.
There is no need to invoke strlen().
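A sketch of the fixed pattern (parameter names illustrative;
filename_to_tablename() is the server function named above, assumed to
return the length of the converted name):

#include <stddef.h>

/* Assign the length returned by filename_to_tablename() directly,
   instead of running strlen() over a buffer whose tail may be
   uninitialized. */
static void parse_name_sketch(const char *db_filename, char *db_buf,
                              size_t db_buf_size, size_t *db_len)
{
  *db_len= filename_to_tablename(db_filename, db_buf, db_buf_size);
}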
With OpenSSL < 1.1 there is a potential for a race condition to occur.
This can cause the S3 engine to crash. The workaround is to add locking
callbacks to OpenSSL so that this doesn't happen.
https://curl.haxx.se/libcurl/c/threadsafe.html
There is a fix in libMariaS3 for this: when a certain flag
(HAVE_CURL_OPENSSL_UNSAFE) is set, the required locks are added.
This patch adds CMake support so that the flag is set whenever curl
is found to be compiled with an unsafe OpenSSL version, for example on
Ubuntu 16.04 with libcurl4-openssl-dev.
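For reference, the classic locking-callback workaround for OpenSSL < 1.1
looks roughly like this (a sketch; the actual fix lives in libMariaS3):

#include <openssl/crypto.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t *ssl_locks;

/* OpenSSL < 1.1 calls this back around every shared-state access. */
static void ssl_lock_cb(int mode, int n, const char *, int)
{
  if (mode & CRYPTO_LOCK)
    pthread_mutex_lock(&ssl_locks[n]);
  else
    pthread_mutex_unlock(&ssl_locks[n]);
}

static void ssl_locks_init(void)
{
  ssl_locks= (pthread_mutex_t *)
    malloc(CRYPTO_num_locks() * sizeof *ssl_locks);
  for (int i= 0; i < CRYPTO_num_locks(); i++)
    pthread_mutex_init(&ssl_locks[i], NULL);
  CRYPTO_set_locking_callback(ssl_lock_cb);
}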
page_cur_insert_rec_low(): Remove a bogus condition that wrongly
omitted redo logging when the record contains no data payload bytes.
We can have such records in secondary indexes, when the values of
the PRIMARY KEY column(s) are the empty string, and the values of
secondary key column(s) are NULL or the empty string.
page_apply_delete_dynamic(): Improve the consistency check, and
do not allow adjacent records to be less than 5 bytes apart from
each other. The fixed-size part of the record header is 5 bytes.
Usually there must also be some header or payload bytes, but in
an extreme case where all columns are CHAR(0) NOT NULL, the
minimum secondary index record size is 5 bytes, and the table can
contain at most 1 row. The minimum clustered index record size is
5+6+7 bytes (header, DB_TRX_ID, DB_ROLL_PTR) or x+5+4 bytes
(fixed-size header, child page number, and some additional header
or payload bytes).
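A sketch of the tightened check (simplified; the constant mirrors
InnoDB's REC_N_NEW_EXTRA_BYTES, the 5-byte fixed-size header, and the
real function validates more than this):

#include <stdint.h>

static const unsigned REC_N_NEW_EXTRA_BYTES_SKETCH= 5;

/* Adjacent record origins must be at least 5 bytes apart, because
   every record carries at least the 5-byte fixed-size header. */
static bool rec_spacing_ok(const uint8_t *rec, const uint8_t *next_rec)
{
  return next_rec >= rec + REC_N_NEW_EXTRA_BYTES_SKETCH;
}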
- multi_range_read_info_const now uses the new records_in_range interface
- Added handler::avg_io_cost()
- Don't recalculate the IO cost in get_sweep_read_cost() if avg_io_cost()
is not 1.0; in that case we trust the avg_io_cost() from the handler.
- Changed test_quick_select to use TIME_FOR_COMPARE instead of
TIME_FOR_COMPARE_IDX to align this with the rest of the code.
- Fixed a bug in test_if_cheaper_ordering() where we didn't use
keyread if the index was changed
- Fixed a bug where we didn't use index-only read when using order-by-index
- Added keyread_time() to HEAP.
The default keyread_time() was optimized for blocks and not suitable for
HEAP. The effect was that HEAP preferred table scans over ranges for btree
indexes.
- Fixed get_sweep_read_cost() for HEAP tables
- Ensure that range and ref have the same cost for simple ranges
Added a small cost (MULTI_RANGE_READ_SETUP_COST) to ranges to ensure
we favor ref over range for simple queries.
- Fixed matching_candidates_in_table() to use the same number of records
as the rest of the optimizer
- Added avg_io_cost() to JT_EQ_REF cost. This helps calculate the cost for
HEAP and temporary tables better. A few tests changed because of this.
- heap::read_time() and heap::keyread_time() adjusted to not add +1.
This was to ensure that handler::keyread_time() doesn't give
higher cost for heap tables than for normal tables. One effect of
this is that heap and derived tables stored in heap will prefer
key access as this is now regarded as cheap.
- Changed cost for index read in sql_select.cc to match
multi_range_read_info_const(). All index cost calculation is now
done through one function.
- 'ref' will now use quick_cost for keys if it exists. This is done
so that for '=' ranges, 'ref' is preferred over 'range'.
- scan_time() now takes avg_io_cost() into account (see the sketch after
this list)
- get_delayed_table_estimates() uses block_size and avg_io_cost()
- Removed default argument to test_if_order_by_key(); simplifies code
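To illustrate the scan_time() change referenced above, a hedged sketch of
how a scan cost can incorporate avg_io_cost() (the actual formula in
handler::scan_time() may differ):

/* Estimated cost of a full table scan: the number of blocks to read,
   scaled by the handler's average cost of a single IO. */
static double scan_time_sketch(double data_file_length, double block_size,
                               double avg_io_cost)
{
  return (data_file_length / block_size + 2) * avg_io_cost;
}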
Prototype change:
- virtual ha_rows records_in_range(uint inx, key_range *min_key,
- key_range *max_key)
+ virtual ha_rows records_in_range(uint inx, const key_range *min_key,
+ const key_range *max_key,
+ page_range *res)
The handler can ignore the page_range parameter. If the handler
updates the parameter, the optimizer can deduce the following:
- Whether the previous range's last key is on the same block as the next
range's first key
- Whether the current key range is within one block
- We can also assume that the first and last blocks read are cached!
This can be used for a better calculation of IO seeks when we
estimate the cost of a range index scan.
The parameter is fully implemented for MyISAM, Aria and InnoDB.
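A hedged sketch of how an engine can fill in the new output parameter
(the page_range member names and the helper functions here are
assumptions, not the actual declarations):

ha_rows ha_example::records_in_range(uint inx, const key_range *min_key,
                                     const key_range *max_key,
                                     page_range *pages)
{
  /* An engine without page information can leave *pages untouched. */
  if (pages)
  {
    pages->first_page= leaf_page_no(inx, min_key);  /* hypothetical helper */
    pages->last_page=  leaf_page_no(inx, max_key);  /* hypothetical helper */
  }
  return estimate_rows_in_range(inx, min_key, max_key);  /* hypothetical */
}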
A separate patch will update handler::multi_range_read_info_const() to
take advantage of this change and also remove the duplicate
records_in_range() calls that are no longer needed.
One may access freed THD members after LOCK_thd_kill is released.
With the original code this could happen when killing a wsrep-disabled
thread on a wsrep-enabled server. With 91ab42a8 it happens on a
wsrep-disabled server.
page_cur_insert_rec_low(): Check the array bounds before comparing.
We used to read one byte beyond the end of the 'rec' payload.
The incorrect logic was originally introduced in
commit 7ae21b18a6
and modified in commit 138cbec5f2.
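The corrected pattern, sketched with illustrative names: the bounds check
must come before the dereference so that the comparison stops at the end
of the payload:

#include <stddef.h>
#include <stdint.h>

/* Count the common prefix of two buffers without ever reading past
   data_size; evaluating n < data_size first is what prevents the
   one-byte overrun. */
static size_t common_prefix(const uint8_t *rec, const uint8_t *insert_buf,
                            size_t data_size)
{
  size_t n= 0;
  while (n < data_size && rec[n] == insert_buf[n])
    n++;
  return n;
}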
Backport to 10.4:
- Don't try to push down SELECTs that have a side effect
- In case the storage engine did support pushdown of SELECT with an INTO
clause, write the rows we've got from it into select->join->result,
and not thd->protocol. This way, SELECT ... INTO ... FROM
smart_engine_table will put the result where instructed, and
NOT send it to the client.
main.mysql_upgrade_noengine did not do "FLUSH PRIVILEGES" after restoring
the original backed-up global_priv table, so subsequent tests could fail
due to missing privileges.
Added the missing FLUSH PRIVILEGES statement.
The cause was an uninitialized variable on the slave when reading a dummy
event that can only be generated by the test.
Fixed by ensuring that flag2 is always initialized.
Fixed also some indentation issues and improved comments.
The test main.mysqltest could crash or hang with
cmake -DWITH_ASAN=ON builds. The reason appears to be
a memory leak, which was found out by manually invoking
echo --replace_regex a > file
ASAN_OPTIONS=log_path=/dev/shm/asan mysqltest ... < file
and then examining the /dev/shm/asan.* file.
commit 121a5e8d07 revised the function
buf_pool_watch_unset() in such a way that the debug field
buf_page_t::in_page_hash is no longer protected by buf_pool.mutex
and thus not safe to access by the debug assertion in
buf_pool_watch_set().
For now, let us revert the change to buf_pool_watch_unset()
and have it acquire the buf_pool.mutex for a longer time.
This was done both to simplify the code and to make it easier to handle
storage engines that are clustered on some index other than the primary
key.
As pk_is_clustering_key() and is_clustering_key() now use only
index_flags, they were removed from all storage engines.
Changes:
- Initialize Aria early to allow it to load the mysql.plugin table with --help
- Don't print 'aborting' when doing --help
- Don't write 'loose' error messages on log_warning < 2 (2 is default)
- Don't write warnings about disabled plugins when doing --help
- Don't write aria_log_control or aria log files when doing --help
- When using --help, open all Aria tables in read only mode (safety)
- If aria_init() fails, do a cleanup(). (Frees used memory)
- If aria_log_control is locked with --help, then don't wait 30 seconds
but instead return at once without initializing the Aria plugin.
MDEV-21604
Added "virtual" low level write function encrypt_or_write that is set
to point to either normal or encrypted write functions.
This patch also fixes a possible memory leak if writing to the binary log
fails.
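A minimal sketch of the dispatch idea (class and member names
illustrative): select the implementation once, and every subsequent write
goes through the pointer with no per-write 'if':

#include <cstddef>

class binlog_writer_sketch
{
  typedef int (binlog_writer_sketch::*write_func)(const unsigned char *,
                                                  size_t);
  write_func encrypt_or_write;

  int write_plain(const unsigned char *buf, size_t len);     /* normal path */
  int write_encrypted(const unsigned char *buf, size_t len); /* encrypts,
                                                                then writes */
public:
  void init(bool encrypt)
  {
    encrypt_or_write= encrypt ? &binlog_writer_sketch::write_encrypted
                              : &binlog_writer_sketch::write_plain;
  }
  int write(const unsigned char *buf, size_t len)
  { return (this->*encrypt_or_write)(buf, len); }
};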
MDEV-21605 Clean up and speed up interfaces for binary row logging
MDEV-21617 Bug fix for previous version of this code
The intention is to have as few 'if's as possible in ha_write() and
related functions. This is done by pre-calculating once per statement the
row_logging state for all tables.
Benefits are simpler and faster code both when binary logging is disabled
and when it's enabled.
Changes:
- Added handler->row_logging to make it easy to check if a table should be
row logged (see the sketch after this list). This also made it easier to
disable row logging for system, internal and temporary tables.
- A table's row_logging capability is checked once per statement that
updates tables, in THD::binlog_prepare_for_row_logging(), which
is called when needed from THD::decide_logging_format().
- Removed most usage of tmp_disable_binlog(), reenable_binlog() and
temporary saving and setting of thd->variables.option_bits.
- Moved checks that can't change during a statement from
check_table_binlog_row_based() to check_table_binlog_row_based_internal()
- Removed flag row_already_logged (used by sequence engine)
- Moved binlog_log_row() to be a handler method
- Moved write_locked_table_maps() to THD::binlog_write_table_maps() as
most other related binlog functions are in THD.
- Removed binlog_write_table_map() and binlog_log_row_internal() as
they are now obsolete: 'has_transactions()' is pre-calculated in
prepare_for_row_logging().
- Removed the 'is_transactional' argument from binlog_write_table_map() as
this can now be read from the handler.
- Changed order of 'if's in handler::external_lock() and wsrep_mysqld.h
to first evaluate fast and likely cases before more complex ones.
- Added error checking in ha_write_row() and related functions if
binlog_log_row() failed.
- Don't clear check_table_binlog_row_based_result in
clear_cached_table_binlog_row_based_flag() as it's not needed.
- THD::clear_binlog_table_maps() has been replaced with
THD::reset_binlog_for_next_statement()
- Added 'MYSQL_OPEN_IGNORE_LOGGING_FORMAT' flag to open_and_lock_tables()
to avoid calculating of binary log format for internal opens. This flag
is also used to avoid reading statistics tables for internal tables.
- Added OPTION_BINLOG_LOG_OFF as a simple way to turn off the binlog
temporarily for CREATE (instead of using THD::sql_log_bin_off).
- Removed flag THD::sql_log_bin_off (not needed anymore)
- Speed up THD::decide_logging_format() by remembering if the blackhole
engine is used and avoiding a loop over all tables if it's not used
(the common case).
- THD::decide_logging_format() is not called anymore if no tables are used
for the statement. This will speed up pure stored procedure code by
about 5% according to some simple tests.
- We now get annotated events on the slave if a CREATE ... SELECT statement
is transformed on the slave from statement to row logging.
- In the original code, the master could get into a state where row
logging was enforced for all future events even when statement-based
logging could have been used. This is now partly fixed.
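As referenced in the first item of the list above, a hedged sketch of the
per-row fast path this enables (simplified; the real ha_write_row() does
considerably more):

int handler::ha_write_row(uchar *buf)   /* simplified sketch */
{
  int error= write_row(buf);            /* engine-specific insert */
  if (!error && row_logging)            /* flag precomputed per statement in
                                           binlog_prepare_for_row_logging() */
    error= binlog_log_row(buf);         /* signature here is illustrative */
  return error;
}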
Other changes:
- Ensure that all tables used by a statement have query_id set.
- Had to restore the row_logging flag for unused tables in
THD::binlog_write_table_maps() (not a normal scenario)
- Removed injector::transaction::use_table(server_id_type sid, table tbl)
as it's not used.
- Cleaned up set_slave_thread_options()
- Some more DBUG_ENTER/DBUG_RETURN, code comments and minor indentation
changes.
- Ensure we only call THD::decide_logging_format_low() once in
mysql_insert() (fixes an inefficiency).
- Don't annotate INSERT DELAYED
- Removed zeroing pos_in_table_list in THD::open_temporary_table() as it's
already 0
MDEV-21606 Improve update handler (long unique keys on blobs)
MDEV-21470 MyISAM and Aria start_bulk_insert doesn't work with long unique
MDEV-21606 Bug fix for previous version of this code
MDEV-21819 2 Assertion `inited == NONE || update_handler != this'
- Move update_handler from TABLE to handler
- Move out initialization of update handler from ha_write_row() to
prepare_for_insert()
- Fixed that INSERT DELAYED works with update handler
- Give an error if using long unique with an autoincrement column
- Added handler function to check if table has long unique hash indexes
- Disable the write cache in MyISAM and Aria when using the update_handler:
if the cache were used, the row would not be inserted until the end of the
statement, and the update_handler would not find conflicting rows.
- Removed the unused handler argument from
check_duplicate_long_entries_update()
- Syntax cleanups
- Indentation fixes
- Don't use single-character identifiers for arguments
- Only indentation changes in sql_rename.cc
- Ignore some WSREP error messages when there isn't an internet connection
- Force restart of stat_tables_part.test to make result stable
- Fixed compiler warnings in CONNECT
MDEV-19964 S3 replication support
Added new configure options:
s3_slave_ignore_updates
"Set this if the slave shares the same S3 storage as the master"
s3_replicate_alter_as_create_select
"When converting an S3 table to a local table, log all rows in the binary log"
This allows one to configure slaves to have their S3 storage shared with or
independent from the master.
Other changes:
Added a new session variable '@@sql_if_exists' to force IF EXISTS on DDLs.