The easiest way to compile and test the server with UBSAN is to run:
./BUILD/compile-pentium64-ubsan
and then run mysql-test-run.
After this commit, one should be able to run this without any UBSAN
warnings. There is still a few compiler warnings that should be fixed
at some point, but these do not expose any real bugs.
The 'special' cases where we disable, suppress or circumvent UBSAN are:
- ref10 source (as here we intentionally do some shifts that UBSAN
complains about.
- x86 version of optimized int#korr() methods. UBSAN do not like unaligned
memory access of integers. Fixed by using byte_order_generic.h when
compiling with UBSAN
- We use smaller thread stack with ASAN and UBSAN, which forced me to
disable a few tests that prints the thread stack size.
- Verifying class types does not work for shared libraries. I added
suppression in mysql-test-run.pl for this case.
- Added '#ifdef WITH_UBSAN' when using integer arithmetic where it is
safe to have overflows (two cases, in item_func.cc).
Things fixed:
- Don't left shift signed values
(byte_order_generic.h, mysqltest.c, item_sum.cc and many more)
- Don't assign not non existing values to enum variables.
- Ensure that bool and enum values are properly initialized in
constructors. This was needed as UBSAN checks that these types has
correct values when one copies an object.
(gcalc_tools.h, ha_partition.cc, item_sum.cc, partition_element.h ...)
- Ensure we do not called handler functions on unallocated objects or
deleted objects.
(events.cc, sql_acl.cc).
- Fixed bugs in Item_sp::Item_sp() where we did not call constructor
on Query_arena object.
- Fixed several cast of objects to an incompatible class!
(Item.cc, Item_buff.cc, item_timefunc.cc, opt_subselect.cc, sql_acl.cc,
sql_select.cc ...)
- Ensure we do not do integer arithmetic that causes over or underflows.
This includes also ++ and -- of integers.
(Item_func.cc, Item_strfunc.cc, item_timefunc.cc, sql_base.cc ...)
- Added JSON_VALUE_UNITIALIZED to json_value_types and ensure that
value_type is initialized to this instead of to -1, which is not a valid
enum value for json_value_types.
- Ensure we do not call memcpy() when second argument could be null.
- Fixed that Item_func_str::make_empty_result() creates an empty string
instead of a null string (safer as it ensures we do not do arithmetic
on null strings).
Other things:
- Changed struct st_position to an OBJECT and added an initialization
function to it to ensure that we do not copy or use uninitialized
members. The change to a class was also motived that we used "struct
st_position" and POSITION randomly trough the code which was
confusing.
- Notably big rewrite in sql_acl.cc to avoid using deleted objects.
- Changed in sql_partition to use '^' instead of '-'. This is safe as
the operator is either 0 or 0x8000000000000000ULL.
- Added check for select_nr < INT_MAX in JOIN::build_explain() to
avoid bug when get_select() could return NULL.
- Reordered elements in POSITION for better alignment.
- Changed sql_test.cc::print_plan() to use pointers instead of objects.
- Fixed bug in find_set() where could could execute '1 << -1'.
- Added variable have_sanitizer, used by mtr. (This variable was before
only in 10.5 and up). It can now have one of two values:
ASAN or UBSAN.
- Moved ~Archive_share() from ha_archive.cc to ha_archive.h and marked
it virtual. This was an effort to get UBSAN to work with loaded storage
engines. I kept the change as the new place is better.
- Added in CONNECT engine COLBLK::SetName(), to get around a wrong cast
in tabutil.cpp.
- Added HAVE_REPLICATION around usage of rgi_slave, to get embedded
server to compile with UBSAN. (Patch from Marko).
- Added #ifdef for powerpc64 to avoid a bug in old gcc versions related
to integer arithmetic.
Changes that should not be needed but had to be done to suppress warnings
from UBSAN:
- Added static_cast<<uint16_t>> around shift to get rid of a LOT of
compiler warnings when using UBSAN.
- Had to change some '/' of 2 base integers to shift to get rid of
some compile time warnings.
Reviewed by:
- Json changes: Alexey Botchkov
- Charset changes in ctype-uca.c: Alexander Barkov
- InnoDB changes & Embedded server: Marko Mäkelä
- sql_acl.cc changes: Vicențiu Ciorbaru
- build_explain() changes: Sergey Petrunia
The assertion failed in handler::ha_reset upon SELECT under
READ UNCOMMITTED from table with index on virtual column.
This was the debug-only failure, though the problem is mush wider:
* MY_BITMAP is a structure containing my_bitmap_map, the latter is a raw
bitmap.
* read_set, write_set and vcol_set of TABLE are the pointers to MY_BITMAP
* The rest of MY_BITMAPs are stored in TABLE and TABLE_SHARE
* The pointers to the stored MY_BITMAPs, like orig_read_set etc, and
sometimes all_set and tmp_set, are assigned to the pointers.
* Sometimes tmp_use_all_columns is used to substitute the raw bitmap
directly with all_set.bitmap
* Sometimes even bitmaps are directly modified, like in
TABLE::update_virtual_field(): bitmap_clear_all(&tmp_set) is called.
The last three bullets in the list, when used together (which is mostly
always) make the program flow cumbersome and impossible to follow,
notwithstanding the errors they cause, like this MDEV-17556, where tmp_set
pointer was assigned to read_set, write_set and vcol_set, then its bitmap
was substituted with all_set.bitmap by dbug_tmp_use_all_columns() call,
and then bitmap_clear_all(&tmp_set) was applied to all this.
To untangle this knot, the rule should be applied:
* Never substitute bitmaps! This patch is about this.
orig_*, all_set bitmaps are never substituted already.
This patch changes the following function prototypes:
* tmp_use_all_columns, dbug_tmp_use_all_columns
to accept MY_BITMAP** and to return MY_BITMAP * instead of my_bitmap_map*
* tmp_restore_column_map, dbug_tmp_restore_column_maps to accept
MY_BITMAP* instead of my_bitmap_map*
These functions now will substitute read_set/write_set/vcol_set directly,
and won't touch underlying bitmaps.
MDEV-19486 and one more similar bug appeared because handler::write_row() interface
welcomes to modify buffer by storage engine. But callers are not ready for that
thus bugs are possible in future.
handler::write_row():
handler::ha_write_row(): make argument const
Archive storage engine assumed that any query that attempts to read from
the table will call ha_archive::info() beforehand. ha_archive would flush
un-written data in that call (this would make it visible for the reads).
Break this assumption. Flush the data when the table is opened for reading.
This way, one can do multiple write statements without causing a flush, but
as soon as we might need the data, we flush it.
This patch implements engine independent unique hash index.
Usage:- Unique HASH index can be created automatically for blob/varchar/test column whose key
length > handler->max_key_length()
or it can be explicitly specified.
Automatic Creation:-
Create TABLE t1 (a blob unique);
Explicit Creation:-
Create TABLE t1 (a int , unique(a) using HASH);
Internal KEY_PART Representations:-
Long unique key_info will have 2 representations.
(lets understand this with an example create table t1(a blob, b blob , unique(a, b)); )
1. User Given Representation:- key_info->key_part array will be similar to what user has defined.
So in case of example it will have 2 key_parts (a, b)
2. Storage Engine Representation:- In this case there will be only one key_part and it will point to
HASH_FIELD. This key_part will be always after user defined key_parts.
So:- User Given Representation [a] [b] [hash_key_part]
key_info->key_part ----^
Storage Engine Representation [a] [b] [hash_key_part]
key_info->key_part ------------^
Table->s->key_info will have User Given Representation, While table->key_info will have Storage Engine
Representation.Representation can be changed into each other by calling re/setup_keyinfo_hash function.
Working:-
1. So when user specifies HASH_INDEX or key_length is > handler->max_key_length(), In mysql_prepare_create_table
One extra vfield is added (for each long unique key). And key_info->algorithm is set to HA_KEY_ALG_LONG_HASH.
2. In init_from_binary_frm_image values for hash_keypart is set (like fieldnr , field and flags)
3. In parse_vcol_defs, HASH_FIELD->vcol_info is created. Item_func_hash is used with list of Item_fields,
When Explicit length is given by user then Item_left is used to concatenate Item_field values.
4. In ha_write_row/ha_update_row check_duplicate_long_entry_key is called which will create the hash key from
table->record[0] and then call ha_index_read_map , if we found duplicated hash , we will compare the result
field by field.
Part of MDEV-5336 Implement LOCK FOR BACKUP
The idea is that instead of waiting in close_cached_tables() for all
tables to be closed, we instead call flush_tables() that does:
- Flush not used objects in table cache to free memory
- Collect all tables that are open
- Call HA_EXTRA_FLUSH on the objects, to get them into "closed state"
- Added HA_EXTRA_FLUSH support to archive and CSV
- Added multi-user protection to HA_EXTRA_FLUSH in MyISAM and Aria
The benefit compared to old code is:
- FTWRL doesn't have to wait for long running read operations or
open HANDLER's
Handle string length as size_t, consistently (almost always:))
Change function prototypes to accept size_t, where in the past
ulong or uint were used. change local/member variables to size_t
when appropriate.
This fix excludes rocksdb, spider,spider, sphinx and connect for now.
- Added sql/mariadb.h file that should be included first by files in sql
directory, if sql_plugin.h is not used (sql_plugin.h adds SHOW variables
that must be done before my_global.h is included)
- Removed a lot of include my_global.h from include files
- Removed include's of some files that my_global.h automatically includes
- Removed duplicated include's of my_sys.h
- Replaced include my_config.h with my_global.h
The fix is about filling the space beyond the end of VARCHAR values with zeroes.
VARCHAR NULLs already has its buffer filled with zeros. So when the VARCHAR fields
are not NULL, then we explicitly fill the buffer with zeros.
- Changed ER(ER_...) to ER_THD(thd, ER_...) when thd was known or if there was many calls to current_thd in the same function.
- Changed ER(ER_..) to ER_THD_OR_DEFAULT(current_thd, ER...) in some places where current_thd is not necessary defined.
- Removing calls to current_thd when we have access to thd
Part of this is optimization (not calling current_thd when not needed),
but part is bug fixing for error condition when current_thd is not defined
(For example on startup and end of mysqld)
Notable renames done as otherwise a lot of functions would have to be changed:
- In JOIN structure renamed:
examined_rows -> join_examined_rows
record_count -> join_record_count
- In Field, renamed new_field() to make_new_field()
Other things:
- Added DBUG_ASSERT(thd == tmp_thd) in Item_singlerow_subselect() just to be safe.
- Removed old 'tab' prefix in JOIN_TAB::save_explain_data() and use members directly
- Added 'thd' as argument to a few functions to avoid calling current_thd.
The reason for the failure was a bug in an include file on debian that causes 'struct stat'
to have different sized depending on the environment.
This patch fixes so that we always include my_global.h or my_config.h before we include any other files.
Other things:
- Removed #include <my_global.h> in some include files; Better to always do this at the top level to have as few
"always-include-this-file-first' files as possible.
- Removed usage of some include files that where already included by my_global.h or by other files.
client/mysql_plugin.c:
Use my_global.h first
client/mysqlslap.c:
Remove duplicated include files
extra/comp_err.c:
Remove duplicated include files
include/m_string.h:
Remove duplicated include files
include/maria.h:
Remove duplicated include files
libmysqld/emb_qcache.cc:
Use my_global.h first
plugin/semisync/semisync.h:
Use my_pthread.h first
sql/datadict.cc:
Use my_global.h first
sql/debug_sync.cc:
Use my_global.h first
sql/derror.cc:
Use my_global.h first
sql/des_key_file.cc:
Use my_global.h first
sql/discover.cc:
Use my_global.h first
sql/event_data_objects.cc:
Use my_global.h first
sql/event_db_repository.cc:
Use my_global.h first
sql/event_parse_data.cc:
Use my_global.h first
sql/event_queue.cc:
Use my_global.h first
sql/event_scheduler.cc:
Use my_global.h first
sql/events.cc:
Use my_global.h first
sql/field.cc:
Use my_global.h first
Remove duplicated include files
sql/field_conv.cc:
Use my_global.h first
sql/filesort.cc:
Use my_global.h first
Remove duplicated include files
sql/gstream.cc:
Use my_global.h first
sql/ha_ndbcluster.cc:
Use my_global.h first
sql/ha_ndbcluster_binlog.cc:
Use my_global.h first
sql/ha_ndbcluster_cond.cc:
Use my_global.h first
sql/ha_partition.cc:
Use my_global.h first
sql/handler.cc:
Use my_global.h first
sql/hash_filo.cc:
Use my_global.h first
sql/hostname.cc:
Use my_global.h first
sql/init.cc:
Use my_global.h first
sql/item.cc:
Use my_global.h first
sql/item_buff.cc:
Use my_global.h first
sql/item_cmpfunc.cc:
Use my_global.h first
sql/item_create.cc:
Use my_global.h first
sql/item_geofunc.cc:
Use my_global.h first
sql/item_inetfunc.cc:
Use my_global.h first
sql/item_row.cc:
Use my_global.h first
sql/item_strfunc.cc:
Use my_global.h first
sql/item_subselect.cc:
Use my_global.h first
sql/item_sum.cc:
Use my_global.h first
sql/item_timefunc.cc:
Use my_global.h first
sql/item_xmlfunc.cc:
Use my_global.h first
sql/key.cc:
Use my_global.h first
sql/lock.cc:
Use my_global.h first
sql/log.cc:
Use my_global.h first
sql/log_event.cc:
Use my_global.h first
sql/log_event_old.cc:
Use my_global.h first
sql/mf_iocache.cc:
Use my_global.h first
sql/mysql_install_db.cc:
Remove duplicated include files
sql/mysqld.cc:
Remove duplicated include files
sql/net_serv.cc:
Remove duplicated include files
sql/opt_range.cc:
Use my_global.h first
sql/opt_subselect.cc:
Use my_global.h first
sql/opt_sum.cc:
Use my_global.h first
sql/parse_file.cc:
Use my_global.h first
sql/partition_info.cc:
Use my_global.h first
sql/procedure.cc:
Use my_global.h first
sql/protocol.cc:
Use my_global.h first
sql/records.cc:
Use my_global.h first
sql/records.h:
Don't include my_global.h
Better to do this at the upper level
sql/repl_failsafe.cc:
Use my_global.h first
sql/rpl_filter.cc:
Use my_global.h first
sql/rpl_gtid.cc:
Use my_global.h first
sql/rpl_handler.cc:
Use my_global.h first
sql/rpl_injector.cc:
Use my_global.h first
sql/rpl_record.cc:
Use my_global.h first
sql/rpl_record_old.cc:
Use my_global.h first
sql/rpl_reporting.cc:
Use my_global.h first
sql/rpl_rli.cc:
Use my_global.h first
sql/rpl_tblmap.cc:
Use my_global.h first
sql/rpl_utility.cc:
Use my_global.h first
sql/set_var.cc:
Added comment
sql/slave.cc:
Use my_global.h first
sql/sp.cc:
Use my_global.h first
sql/sp_cache.cc:
Use my_global.h first
sql/sp_head.cc:
Use my_global.h first
sql/sp_pcontext.cc:
Use my_global.h first
sql/sp_rcontext.cc:
Use my_global.h first
sql/spatial.cc:
Use my_global.h first
sql/sql_acl.cc:
Use my_global.h first
sql/sql_admin.cc:
Use my_global.h first
sql/sql_analyse.cc:
Use my_global.h first
sql/sql_audit.cc:
Use my_global.h first
sql/sql_base.cc:
Use my_global.h first
sql/sql_binlog.cc:
Use my_global.h first
sql/sql_bootstrap.cc:
Use my_global.h first
Use my_global.h first
sql/sql_cache.cc:
Use my_global.h first
sql/sql_class.cc:
Use my_global.h first
sql/sql_client.cc:
Use my_global.h first
sql/sql_connect.cc:
Use my_global.h first
sql/sql_crypt.cc:
Use my_global.h first
sql/sql_cursor.cc:
Use my_global.h first
sql/sql_db.cc:
Use my_global.h first
sql/sql_delete.cc:
Use my_global.h first
sql/sql_derived.cc:
Use my_global.h first
sql/sql_do.cc:
Use my_global.h first
sql/sql_error.cc:
Use my_global.h first
sql/sql_explain.cc:
Use my_global.h first
sql/sql_expression_cache.cc:
Use my_global.h first
sql/sql_handler.cc:
Use my_global.h first
sql/sql_help.cc:
Use my_global.h first
sql/sql_insert.cc:
Use my_global.h first
sql/sql_lex.cc:
Use my_global.h first
sql/sql_load.cc:
Use my_global.h first
sql/sql_locale.cc:
Use my_global.h first
sql/sql_manager.cc:
Use my_global.h first
sql/sql_parse.cc:
Use my_global.h first
sql/sql_partition.cc:
Use my_global.h first
sql/sql_plugin.cc:
Added comment
sql/sql_prepare.cc:
Use my_global.h first
sql/sql_priv.h:
Added error if we use this before including my_global.h
This check is here becasue so many files includes sql_priv.h first.
sql/sql_profile.cc:
Use my_global.h first
sql/sql_reload.cc:
Use my_global.h first
sql/sql_rename.cc:
Use my_global.h first
sql/sql_repl.cc:
Use my_global.h first
sql/sql_select.cc:
Use my_global.h first
sql/sql_servers.cc:
Use my_global.h first
sql/sql_show.cc:
Added comment
sql/sql_signal.cc:
Use my_global.h first
sql/sql_statistics.cc:
Use my_global.h first
sql/sql_table.cc:
Use my_global.h first
sql/sql_tablespace.cc:
Use my_global.h first
sql/sql_test.cc:
Use my_global.h first
sql/sql_time.cc:
Use my_global.h first
sql/sql_trigger.cc:
Use my_global.h first
sql/sql_udf.cc:
Use my_global.h first
sql/sql_union.cc:
Use my_global.h first
sql/sql_update.cc:
Use my_global.h first
sql/sql_view.cc:
Use my_global.h first
sql/sys_vars.cc:
Added comment
sql/table.cc:
Use my_global.h first
sql/thr_malloc.cc:
Use my_global.h first
sql/transaction.cc:
Use my_global.h first
sql/uniques.cc:
Use my_global.h first
sql/unireg.cc:
Use my_global.h first
sql/unireg.h:
Removed inclusion of my_global.h
storage/archive/ha_archive.cc:
Added comment
storage/blackhole/ha_blackhole.cc:
Use my_global.h first
storage/csv/ha_tina.cc:
Use my_global.h first
storage/csv/transparent_file.cc:
Use my_global.h first
storage/federated/ha_federated.cc:
Use my_global.h first
storage/federatedx/federatedx_io.cc:
Use my_global.h first
storage/federatedx/federatedx_io_mysql.cc:
Use my_global.h first
storage/federatedx/federatedx_io_null.cc:
Use my_global.h first
storage/federatedx/federatedx_txn.cc:
Use my_global.h first
storage/heap/ha_heap.cc:
Use my_global.h first
storage/innobase/handler/handler0alter.cc:
Use my_global.h first
storage/maria/ha_maria.cc:
Use my_global.h first
storage/maria/unittest/ma_maria_log_cleanup.c:
Remove duplicated include files
storage/maria/unittest/test_file.c:
Added comment
storage/myisam/ha_myisam.cc:
Move sql_plugin.h first as this includes my_global.h
storage/myisammrg/ha_myisammrg.cc:
Use my_global.h first
storage/oqgraph/oqgraph_thunk.cc:
Use my_config.h and my_global.h first
One could not include my_global.h before oqgraph_thunk.h (don't know why)
storage/spider/ha_spider.cc:
Use my_global.h first
storage/spider/hs_client/config.cpp:
Use my_global.h first
storage/spider/hs_client/escape.cpp:
Use my_global.h first
storage/spider/hs_client/fatal.cpp:
Use my_global.h first
storage/spider/hs_client/hstcpcli.cpp:
Use my_global.h first
storage/spider/hs_client/socket.cpp:
Use my_global.h first
storage/spider/hs_client/string_util.cpp:
Use my_global.h first
storage/spider/spd_conn.cc:
Use my_global.h first
storage/spider/spd_copy_tables.cc:
Use my_global.h first
storage/spider/spd_db_conn.cc:
Use my_global.h first
storage/spider/spd_db_handlersocket.cc:
Use my_global.h first
storage/spider/spd_db_mysql.cc:
Use my_global.h first
storage/spider/spd_db_oracle.cc:
Use my_global.h first
storage/spider/spd_direct_sql.cc:
Use my_global.h first
storage/spider/spd_i_s.cc:
Use my_global.h first
storage/spider/spd_malloc.cc:
Use my_global.h first
storage/spider/spd_param.cc:
Use my_global.h first
storage/spider/spd_ping_table.cc:
Use my_global.h first
storage/spider/spd_sys_table.cc:
Use my_global.h first
storage/spider/spd_table.cc:
Use my_global.h first
storage/spider/spd_trx.cc:
Use my_global.h first
storage/xtradb/handler/handler0alter.cc:
Use my_global.h first
storage/xtradb/handler/i_s.cc:
Use my_global.h first