The crash was caused by improper raising of an error or replication checksum
verification at time of the server initialization. As there is no THD object
associated with the main initializing thread yet the error text should be
assigned with calling a respective macro that is aware of that possibility.
Fixed accordingly.
[At merging to 10.4 the new test result file needs
+# restart: --master_verify_checksum=ON --debug_dbug=+d,corrupt_read_log_event_char
that mtr run will hint on.]
Analysis:
========
"mysqlbinlog -v" option will reconstruct row events and display them as
commented SQL statements. If this option is given twice, the output includes
comments to indicate column data types and some metadata.
`log_event_print_value` is the function reponsible for printing values and
their types. This function doesn't handle GEOMETRY type. Hence the above error
gets printed.
Fix:
===
Add support for GEOMETRY datatype.
Problem:
=======
SHOW BINLOG EVENTS FROM <"random"-pos> caused a variety of failures as
reported in MDEV-18046. They are fixed but that approach is not future-proof
as well as is not optimal to create extra check for being constructed event
parameters.
Analysis:
=========
"show binlog events from <pos>" code considers the user given position as a
valid event start position. The code starts reading data from this event start
position onwards and tries to map it to a set of known events. Each event has
a specific event structure and asserts have been added to ensure that, read
event data, satisfies the event specific requirements. When a random position
is supplied to "show binlog events command" the event structure specific
checks will fail and they result in assert.
For example: https://jira.mariadb.org/browse/MDEV-18046
In the bug description user executes CREATE TABLE/INSERT and ALTER SQL
commands.
When a crazy offset like "SHOW BINLOG EVENTS FROM 365" is provided code
assumes offset 365 as valid event begin and proceeds to EVENT_LEN_OFFSET reads
some random length and comes up with a crazy event which didn't exits in the
binary log. In this quoted example scenario, event read at offset 365 is
considered as "Update_rows_log_event", which is not present in binary log.
Since this is a random event its validation fails and code results in
assert/segmentation fault, as shown below.
mysqld: /data/src/10.4/sql/log_event.cc:10863: Rows_log_event::Rows_log_event(
const char*, uint, const Format_description_log_event*):
Assertion `var_header_len >= 2' failed.
181220 15:27:02 [ERROR] mysqld got signal 6 ;
#7 0x00007fa0d96abee2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#8 0x000055e744ef82de in Rows_log_event::Rows_log_event (this=0x7fa05800d390,
buf=0x7fa05800d080 "", event_len=254, description_event=0x7fa058006d60) at
/data/src/10.4/sql/log_event.cc:10863
#9 0x000055e744f00cf8 in Update_rows_log_event::Update_rows_log_event
Since we are reading random data repeating the same command SHOW BINLOG EVENTS
FROM 365 produces different types of crashes with different events. MDEV-18046
reported 10 such crashes.
In order to avoid such scenarios user provided starting offset needs to be
validated for its correctness. Best way of doing this is to make use of
checksums if they are available. MDEV-18046 fix introduced the checksum based
validation.
The issue still remains in cases where binlog checksums are disabled. Please
find the following bug reports.
MDEV-22473: binlog.binlog_show_binlog_event_random_pos failed in buildbot,
server crashed in read_log_event
MDEV-22455: Server crashes in Table_map_log_event,
binlog.binlog_invalid_read_in_rotate failed in buildbot
Fix:
====
When binlog checksum is disabled, perform scan(via reading event by event), to
validate the requested FROM <pos> offset. Starting from offset 4 read the
event_length of next_event in the binary log. Using the next_event length
advance current offset to point to next event. Repeat this process till the
current offset is less than or equal to crazy offset. If current offset is
higher than crazy offset provide appropriate invalid input offset error.
(This commit is exclusively for 10.1 branch, do not merge it to upper ones)
In case of a pattern of non-STMT_END-marked Rows-log-event (A) followed by
a STMT_END marked one (B) mysqlbinlog mixes up the base64 encoded rows events
with their pseudo sql representation produced by the verbose option:
BINLOG '
base64 encoded data for A
### verbose section for A
base64 encoded data for B
### verbose section for B
'/*!*/;
In effect the produced BINLOG '...' query is not valid and is rejected with the error.
Examples of this way malformed BINLOG could have been found in binlog_row_annotate.result
that gets corrected with the patch.
The issue is fixed with introduction an auxiliary IO_CACHE to hold on the verbose
comments until the terminal STMT_END event is found. The new cache is emptied
out after two pre-existing ones are done at that time.
The correctly produced output now for the above case is as the following:
BINLOG '
base64 encoded data for A
base64 encoded data for B
'/*!*/;
### verbose section for A
### verbose section for B
Thanks to Alexey Midenkov for the problem recognition and attempt to tackle,
Venkatesh Duggirala who produced a patch for the upstream whose
idea is exploited here, as well as to MDEV-23077 reporter LukeXwang who
also contributed a piece of a patch aiming at this issue.
Extra: mysqlbinlog_row_minimal refined to not produce mutable numeric values into the result file.
(This commit is for 10.3 and upper branches)
In case of a pattern of non-STMT_END-marked Rows-log-event (A) followed by
a STMT_END marked one (B) mysqlbinlog mixes up the base64 encoded rows events
with their pseudo sql representation produced by the verbose option:
BINLOG '
base64 encoded data for A
### verbose section for A
base64 encoded data for B
### verbose section for B
'/*!*/;
In effect the produced BINLOG '...' query is not valid and is rejected with the error.
Examples of this way malformed BINLOG could have been found in binlog_row_annotate.result
that gets corrected with the patch.
The issue is fixed with introduction an auxiliary IO_CACHE to hold on the verbose
comments until the terminal STMT_END event is found. The new cache is emptied
out after two pre-existing ones are done at that time.
The correctly produced output now for the above case is as the following:
BINLOG '
base64 encoded data for A
base64 encoded data for B
'/*!*/;
### verbose section for A
### verbose section for B
Thanks to Alexey Midenkov for the problem recognition and attempt to tackle,
and to Venkatesh Duggirala who produced a patch for the upstream whose
idea is exploited here, as well as to MDEV-23077 reporter LukeXwang who
also contributed a piece of a patch aiming at this issue.
(This commit is exclusively for 10.2 branch. Do not merge it to 10.3)
In case of a pattern of non-STMT_END-marked Rows-log-event (A) followed by
a STMT_END marked one (B) mysqlbinlog mixes up the base64 encoded rows events
with their pseudo sql representation produced by the verbose option:
BINLOG '
base64 encoded data for A
### verbose section for A
base64 encoded data for B
### verbose section for B
'/*!*/;
In effect the produced BINLOG '...' query is not valid and is rejected with the error.
Examples of this way malformed BINLOG could have been found in binlog_row_annotate.result
that gets corrected with the patch.
The issue is fixed with introduction an auxiliary IO_CACHE to hold on the verbose
comments until the terminal STMT_END event is found. The new cache is emptied
out after two pre-existing ones are done at that time.
The correctly produced output now for the above case is as the following:
BINLOG '
base64 encoded data for A
base64 encoded data for B
'/*!*/;
### verbose section for A
### verbose section for B
Thanks to Alexey Midenkov for the problem recognition and attempt to tackle,
and to Venkatesh Duggirala who produced a patch for the upstream whose
idea is exploited here, as well as to MDEV-23077 reporter LukeXwang who
also contributed a piece of a patch aiming at this issue.
Shutdown of mtr tests may be too impatient, esp on CI environment where
10 seconds of `arg` of `shutdown_server arg` may not be enough for the clean
shutdown to complete.
This is fixed to remove explicit non-zero timeout argument to
`shutdown_server` from all mtr tests. mysqltest computes 60 seconds default
value for the timeout for the argless `shutdown_server` command.
This policy is additionally ensured with a compile time assert.
Problem:
========
During point in time recovery of binary log syntax error is reported for
BEGIN statement and recovery fails.
Analysis:
=========
In MariaDB 10.3 and later, setting the sql_mode system variable to Oracle
allows the server to understand a subset of Oracle's PL/SQL language. When
sql_mode=ORACLE is set, it switches the parser from the MariaDB parser to
Oracle compatible parser. With this change 'BEGIN' is not considered as
'START TRANSACTION'. Hence the syntax error is reported.
Fix:
===
At preset 'BEGIN' query is generated from 'Gtid_log_event::print'. The current
session specific 'sql_mode' information is not present as part of
'Gtid_log_event'. If it was available then, mysqlbinlog tool can make use of
'sql_mode == ORACLE' and can output "START TRANSACTION" in this particular
mode and for other sql_modes it will write "BEGIN" as part of output. Since it
is not available 'mysqlbinlog' tool will output all 'BEGIN' statements as
'START TRANSACTION' irrespective of 'sql_mode'.
Analysis:
========
RESET MASTER TO # command deletes all binary log files listed in the index
file, resets the binary log index file to be empty, and creates a new binary
log with number #. When the user provided binary log number is greater than
the max allowed value '2147483647' server fails to generate a new binary log.
The RESET MASTER statement marks the binlog closure status as
'LOG_CLOSE_TO_BE_OPENED' and exits. Statements which follow RESET MASTER
try to write to binary log they find the log_state != LOG_CLOSED and
proceed to write to binary log cache and it results in crash.
Fix:
===
During MYSQL_BIN_LOG open, if generation of new binary log name fails then the
"log_state" needs to be marked as "LOG_CLOSED". With this further statements
will find binary log as closed and they will skip writing to the binary log.
The assert was caused by early cleanup of a user variable participant
in BINLOG @var,@var where it plays twice. That was unexpected by the base
code to clear its value prematurely.
Fixed with relocating the user var destruction after operations with
its value is over.
MDEV-18046: Assortment of crashes, assertion failures and ASAN errors in mysql_show_binlog_events
Problem:
========
SHOW BINLOG EVENTS FROM <pos> reports following assert when ASAN is enabled.
uint32 binlog_get_uncompress_len(const char*):
Assertion `(buf[0] & 0xe0) == 0x80' failed
Fix:
===
**Part11: Converted debug assert to error handler code**
Problem:
========
SHOW BINLOG EVENTS FROM <pos> causes a variety of failures, some of which are
listed below. It is not a race condition issue, but there is some
non-determinism in it.
Analysis:
========
"show binlog events from <pos>" code considers the user given position as a
valid event start position. The code starts reading data from this event start
position onwards and tries to map it to a set of known events. Each event has
a specific event structure and asserts have been added to ensure that read
event data satisfies the event specific requirements. When a random position
is supplied to "show binlog events command" the event structure specific
checks will fail and they result in assert.
Fix:
====
The fix is split into different parts. Each part addresses either an ASAN
issue or an assert/crash.
**Part1: Checksum based position validation when checksum is enabled**
Using checksum validate the very first event read at the user specified
position. If there is a checksum mismatch report an appropriate error for the
invalid event.
Relates to MDEV-17863 DROP TEMPORARY TABLE creates a transaction in
binary log on read only server
Other things:
- Fixed that insert into normal_table select from tmp_table is
replicated as row events if tmp_table doesn't exists on slave.
- Any temporary tables created under read-only mode will never be logged
to binary log. Any usage of these tables to update normal tables, even
after read-only has been disabled, will use row base logging (as the
temporary table will not be on the slave).
- Analyze, check and repair table will not be logged in read-only mode.
Other things:
- Removed not used varaibles in
MYSQL_BIN_LOG::flush_and_set_pending_rows_event.
- Set table_share->table_creation_was_logged for all normal tables.
- THD::binlog_query() now returns -1 if statement was not logged., This
is used to update table_share->table_creation_was_logged.
- Don't log admin statements in opt_readonly is set.
- Table's that doesn't have table_creation_was_logged will set binlog format to row
logging.
- Removed not needed/wrong setting of table->s->table_creation_was_logged
in create_table_from_items()
calc_field_event_length should accurately calculate the size of BLOB type
fields, Instead of returning just the bytes taken by length it should return
length bytes + actual length.
This allows one to run the test suite even if any of the following
options are changed:
- character-set-server
- collation-server
- join-cache-level
- log-basename
- max-allowed-packet
- optimizer-switch
- query-cache-size and query-cache-type
- skip-name-resolve
- table-definition-cache
- table-open-cache
- Some innodb options
etc
Changes:
- Don't print out the value of system variables as one can't depend on
them to being constants.
- Don't set global variables to 'default' as the default may not
be the same as the test was started with if there was an additional
option file. Instead save original value and reset it at end of test.
- Test that depends on the latin1 character set should include
default_charset.inc or set the character set to latin1
- Test that depends on the original optimizer switch, should include
default_optimizer_switch.inc
- Test that depends on the value of a specific system variable should
set it in the test (like optimizer_use_condition_selectivity)
- Split subselect3.test into subselect3.test and subselect3.inc to
make it easier to set and reset system variables.
- Added .opt files for test that required specfic options that could
be changed by external configuration files.
- Fixed result files in rockdsb & tokudb that had not been updated for
a while.
Analysis:
========
As part of BUG#28642318 fix, two new test cases were added. The first test
case tests a scenario where two sessions are present, in which the first
session has a regular table named 't1' and another session has a temporary
table named 't1'. Test executes a DELETE statement on regular table. These
statements are captured from binary log and replayed back on new client
connection to prove that DELETE statement is applied successfully. Note that
the binlog contains only CREATE TEMPORARY TABLE part hence a temporary table
gets created in new connection. This replaying logic is implemented by using
'--exec $MYSQL' command. If the new connection gets disconnected within the
scope of first test case the test passes, i.e the temporary table gets dropped
as part thread cleanup. But on slow platforms the connection gets closed at
the time of execution of test case 2. When the temporary table is dropped as
part thread cleanup a "DROP TEMPORARY TABLE t1" is written into the binary
log. In test case two the same sessions continue to exist and and table names
are reused to test a new bug scenario. The additional "DROP TEMPORARY TABLE"
command drops second test specific tables which results in "Unknown table"
error.
Fix:
====
Rename the second case specific table to 't2'. Even if the close connection
from test case one happens later the drop command with has
'DROP /*!40005 TEMPORARY */ TABLE IF EXISTS `t1`' will not result in an error.
MDEV-5589 commit set up a policy to skip DROP TEMPORARY TABLE binary logging
in case the target table has not been "CREATEed" in binlog (no CREATE
Query-log-event was logged into the binary log).
It turns out that
1. the rule did not cover non-existing table DROPped with IF-EXISTS clause.
The logged-create knowledge for the non-existing one does not even need
MDEV-5589 patch, and
2. connection close disobeys it to trigger automatic DROP-IF-EXISTS
binlogging.
Either 1 or 2 or even both is/are also responsible for unexpected binlog
records observed in MDEV-17863, actually rendering a referred
@@global.read_only irrelevant as far as the described stored procedure
definition *and* the ROW binlog-format are concerned.
Analysis
========
Point in time recovery using mysqlbinlog containing queries
operating on temporary tables results in an error.
While writing the query log event in the binary log, the
thread id used for execution of DROP TABLE and DELETE commands
were incorrect. The thread variable 'thread_specific_used'
is used to determine whether a specific thread id is to used
while executing the statements i.e using 'SET
@@session.pseudo_thread_id'. This variable was not set
correctly for DROP TABLE query and was never set for DELETE
query. The thread id is important for temporary tables
since the tables are session specific. DROP TABLE and DELETE
queries executed using a wrong thread id resulted in errors
while applying the queries generated by mysqlbinlog utility.
Fix
===
Set the 'thread_specific_used' THD variable for DROP TABLE and
DELETE queries.
ReviewBoard: 21833