Problem:
=======
The ALWAYS option of the mariadb-binlog --base64-output flag
formats its output incorrectly. This option is deprecated, and
MySQL 8.0 has removed it entirely.
Solution:
========
Adhere to MySQL and remove this option from MariaDB.
Behavioral Changes:
==================
Use Case: ./mariadb-binlog --base64-output
Previous Behavior: Sets base64-output mode to always
New Behavior: Error message indicating incomplete argument
Use Case: ./mariadb-binlog --base64-output=always
Previous Behavior: Sets base64-output mode to always
New Behavior: Error message indicating invalid argument value
Reviewed By:
==========
Andrei Elkin: <andrei.elkin@mariadb.com>
The easiest way to compile and test the server with UBSAN is to run:
./BUILD/compile-pentium64-ubsan
and then run mysql-test-run.
After this commit, one should be able to run this without any UBSAN
warnings. There is still a few compiler warnings that should be fixed
at some point, but these do not expose any real bugs.
The 'special' cases where we disable, suppress or circumvent UBSAN are:
- ref10 source (as here we intentionally do some shifts that UBSAN
complains about.
- x86 version of optimized int#korr() methods. UBSAN do not like unaligned
memory access of integers. Fixed by using byte_order_generic.h when
compiling with UBSAN
- We use smaller thread stack with ASAN and UBSAN, which forced me to
disable a few tests that prints the thread stack size.
- Verifying class types does not work for shared libraries. I added
suppression in mysql-test-run.pl for this case.
- Added '#ifdef WITH_UBSAN' when using integer arithmetic where it is
safe to have overflows (two cases, in item_func.cc).
Things fixed:
- Don't left shift signed values
(byte_order_generic.h, mysqltest.c, item_sum.cc and many more)
- Don't assign not non existing values to enum variables.
- Ensure that bool and enum values are properly initialized in
constructors. This was needed as UBSAN checks that these types has
correct values when one copies an object.
(gcalc_tools.h, ha_partition.cc, item_sum.cc, partition_element.h ...)
- Ensure we do not called handler functions on unallocated objects or
deleted objects.
(events.cc, sql_acl.cc).
- Fixed bugs in Item_sp::Item_sp() where we did not call constructor
on Query_arena object.
- Fixed several cast of objects to an incompatible class!
(Item.cc, Item_buff.cc, item_timefunc.cc, opt_subselect.cc, sql_acl.cc,
sql_select.cc ...)
- Ensure we do not do integer arithmetic that causes over or underflows.
This includes also ++ and -- of integers.
(Item_func.cc, Item_strfunc.cc, item_timefunc.cc, sql_base.cc ...)
- Added JSON_VALUE_UNITIALIZED to json_value_types and ensure that
value_type is initialized to this instead of to -1, which is not a valid
enum value for json_value_types.
- Ensure we do not call memcpy() when second argument could be null.
- Fixed that Item_func_str::make_empty_result() creates an empty string
instead of a null string (safer as it ensures we do not do arithmetic
on null strings).
Other things:
- Changed struct st_position to an OBJECT and added an initialization
function to it to ensure that we do not copy or use uninitialized
members. The change to a class was also motived that we used "struct
st_position" and POSITION randomly trough the code which was
confusing.
- Notably big rewrite in sql_acl.cc to avoid using deleted objects.
- Changed in sql_partition to use '^' instead of '-'. This is safe as
the operator is either 0 or 0x8000000000000000ULL.
- Added check for select_nr < INT_MAX in JOIN::build_explain() to
avoid bug when get_select() could return NULL.
- Reordered elements in POSITION for better alignment.
- Changed sql_test.cc::print_plan() to use pointers instead of objects.
- Fixed bug in find_set() where could could execute '1 << -1'.
- Added variable have_sanitizer, used by mtr. (This variable was before
only in 10.5 and up). It can now have one of two values:
ASAN or UBSAN.
- Moved ~Archive_share() from ha_archive.cc to ha_archive.h and marked
it virtual. This was an effort to get UBSAN to work with loaded storage
engines. I kept the change as the new place is better.
- Added in CONNECT engine COLBLK::SetName(), to get around a wrong cast
in tabutil.cpp.
- Added HAVE_REPLICATION around usage of rgi_slave, to get embedded
server to compile with UBSAN. (Patch from Marko).
- Added #ifdef for powerpc64 to avoid a bug in old gcc versions related
to integer arithmetic.
Changes that should not be needed but had to be done to suppress warnings
from UBSAN:
- Added static_cast<<uint16_t>> around shift to get rid of a LOT of
compiler warnings when using UBSAN.
- Had to change some '/' of 2 base integers to shift to get rid of
some compile time warnings.
Reviewed by:
- Json changes: Alexey Botchkov
- Charset changes in ctype-uca.c: Alexander Barkov
- InnoDB changes & Embedded server: Marko Mäkelä
- sql_acl.cc changes: Vicențiu Ciorbaru
- build_explain() changes: Sergey Petrunia
One should not change the program arguments!
This change also reduces warnings from the icc compiler.
Almost all changes are just syntax changes (adding const to
'get_one_option function' declarations).
Other changes:
- Added a few cast of 'argument' from 'const char*' to 'char *'. This
was mainly in calls to 'external' functions we don't have control of.
- Ensure that all reset of 'password command line argument' are similar.
(In almost all cases it was just adding a comment and a cast)
- In mysqlbinlog.cc and mysqld.cc there was a few cases that changed
the command line argument. These places where changed to instead allocate
the option in a MEM_ROOT to avoid changing the argument. Some of this
code was changed to ensure that different programs did parsing the
same way. Added a test case for the changes in mysqlbinlog.cc
- Changed a few variables that took their value from command line options
from 'char *' to 'const char *'.
a. The change makes `mariadb-upgrade` detect if `MYSQL_JSON` data type is needed.
b. Install the data type if it's not installed.
c. Uninstalls the data type once finished.
d. Create `.opt` and `.inc` files `have_type_mysql_json` and adapt the
tests
Reviewed by: vicentiu@mariadb.org
statistics calculation
Analysis: When --replace or --insert-ignore is not given, dumping of
mysql.innodb_index_stats and mysql.innodb_table_stats will result into race
condition.
Fix: Check if these options are present with --system=stats (because dumping
under --system=stats is safe). Otherwise, dump only structure, ignoring data
because innodb will recalculate data anyway.
Dump sequences first.
This atch made to keep it small and
to keep number of queries to the server the same.
Order of tables in a dump can not be changed
(except sequences first) because mysql_list_tables
uses SHOW TABLES and I used SHOW FULL TABLES.
statistics calculation
mysqldump --system=stats and --system=timezones by default used
ordinary INSERT statements populate EITS, innodb stats, and timezone tables.
As these all have primary keys it could result in conflict.
The behavior desired with --system= is to replace the tables.
As such we assume --replace for the purposes of stats and timezone tables
there if --insert-ignore isn't specified.
This follows up commit
commit 94a520ddbe and
commit 7c5519c12d.
After these changes, the default test suites on a
cmake -DWITH_UBSAN=ON build no longer fail due to passing
null pointers as parameters that are declared to never be null,
but plenty of other runtime errors remain.
Add --system={all, users, plugins, udfs, servers, stats, timezones}
This will dump system information from the server in
a logical form like:
* CREATE USER
* GRANT
* SET DEFAULT ROLE
* CREATE ROLE
* CREATE SERVER
* INSTALL PLUGIN
* CREATE FUNCTION
"stats" is the innodb statistics tables or EITS and
these are dumped as INSERT/REPLACE INTO statements
without recreating the table.
"timezones" is the collection of timezone tables
which are important to transfer to generate identical
results on restoration.
Two other options have an effect on the SQL generated by
--system=all. These are mutually exclusive of each other.
* --replace
* --insert-ignore
--replace will include "OR REPLACE" into the logical form
like:
* CREATE OR REPLACE USER ...
* DROP ROLE IF EXISTS (MySQL-8.0+)
* CREATE OR REPLACE ROLE ...
* UNINSTALL PLUGIN IF EXISTS (10.4+) ... (before INSTALL PLUGIN)
* DROP FUNCTION IF EXISTS (MySQL-5.7+)
* CREATE OR REPLACE [AGGREGATE] FUNCTION
* CREATE OR REPLACE SERVER
--insert-ignore uses the construct " IF NOT EXISTS" where
supported in the logical syntax.
'CREATE OR REPLACE USER' includes protection against
being run as the same user that is importing the mysqldump.
Includes experimental support for dumping mysql-5.7/8.0
system tables and exporting logical SQL compatible with MySQL.
Updates mysqldump man page, including this information and
(removing obsolute bug reference)
Reviewed-by: anel@mariadb.org
- Original patch was contributed by Jani Tolonen <jani.k.tolonen@gmail.com>
https://github.com/an3l/server/commits/bb-10.3-anel-MDEV-21786-dump-sequence
which distinguishes data structure (linked list) of sequences from
tables.
- Added standard sql output to prevent future changes
of sequences and disabled locks for sequences.
- Added test case for `MDEV-20070: mysqldump won't work correct on
sequences` where table column depends on sequence value.
- Restore sequence last value in the following way:
- Find `next_not_cached_value` and use it to `setval()`
- We just need for logical restore, so don't execute `setval()`
- `setval()` should be showed also in case of `--no-data` option.
Reviewed-by: daniel@mariadb.org
When including a generated file, always use <...>.
We need the compiler to find it in the BINDIR, not in the SRCDIR.
But when including as "..." SRCDIR is always searched first.
The bug can only happen in out-of-source builds, if there was an
in-source build before.
(This commit is exclusively for 10.1 branch, do not merge it to upper ones)
In case of a pattern of non-STMT_END-marked Rows-log-event (A) followed by
a STMT_END marked one (B) mysqlbinlog mixes up the base64 encoded rows events
with their pseudo sql representation produced by the verbose option:
BINLOG '
base64 encoded data for A
### verbose section for A
base64 encoded data for B
### verbose section for B
'/*!*/;
In effect the produced BINLOG '...' query is not valid and is rejected with the error.
Examples of this way malformed BINLOG could have been found in binlog_row_annotate.result
that gets corrected with the patch.
The issue is fixed with introduction an auxiliary IO_CACHE to hold on the verbose
comments until the terminal STMT_END event is found. The new cache is emptied
out after two pre-existing ones are done at that time.
The correctly produced output now for the above case is as the following:
BINLOG '
base64 encoded data for A
base64 encoded data for B
'/*!*/;
### verbose section for A
### verbose section for B
Thanks to Alexey Midenkov for the problem recognition and attempt to tackle,
Venkatesh Duggirala who produced a patch for the upstream whose
idea is exploited here, as well as to MDEV-23077 reporter LukeXwang who
also contributed a piece of a patch aiming at this issue.
Extra: mysqlbinlog_row_minimal refined to not produce mutable numeric values into the result file.
(This commit is for 10.3 and upper branches)
In case of a pattern of non-STMT_END-marked Rows-log-event (A) followed by
a STMT_END marked one (B) mysqlbinlog mixes up the base64 encoded rows events
with their pseudo sql representation produced by the verbose option:
BINLOG '
base64 encoded data for A
### verbose section for A
base64 encoded data for B
### verbose section for B
'/*!*/;
In effect the produced BINLOG '...' query is not valid and is rejected with the error.
Examples of this way malformed BINLOG could have been found in binlog_row_annotate.result
that gets corrected with the patch.
The issue is fixed with introduction an auxiliary IO_CACHE to hold on the verbose
comments until the terminal STMT_END event is found. The new cache is emptied
out after two pre-existing ones are done at that time.
The correctly produced output now for the above case is as the following:
BINLOG '
base64 encoded data for A
base64 encoded data for B
'/*!*/;
### verbose section for A
### verbose section for B
Thanks to Alexey Midenkov for the problem recognition and attempt to tackle,
and to Venkatesh Duggirala who produced a patch for the upstream whose
idea is exploited here, as well as to MDEV-23077 reporter LukeXwang who
also contributed a piece of a patch aiming at this issue.
(This commit is exclusively for 10.2 branch. Do not merge it to 10.3)
In case of a pattern of non-STMT_END-marked Rows-log-event (A) followed by
a STMT_END marked one (B) mysqlbinlog mixes up the base64 encoded rows events
with their pseudo sql representation produced by the verbose option:
BINLOG '
base64 encoded data for A
### verbose section for A
base64 encoded data for B
### verbose section for B
'/*!*/;
In effect the produced BINLOG '...' query is not valid and is rejected with the error.
Examples of this way malformed BINLOG could have been found in binlog_row_annotate.result
that gets corrected with the patch.
The issue is fixed with introduction an auxiliary IO_CACHE to hold on the verbose
comments until the terminal STMT_END event is found. The new cache is emptied
out after two pre-existing ones are done at that time.
The correctly produced output now for the above case is as the following:
BINLOG '
base64 encoded data for A
base64 encoded data for B
'/*!*/;
### verbose section for A
### verbose section for B
Thanks to Alexey Midenkov for the problem recognition and attempt to tackle,
and to Venkatesh Duggirala who produced a patch for the upstream whose
idea is exploited here, as well as to MDEV-23077 reporter LukeXwang who
also contributed a piece of a patch aiming at this issue.